PREFACE


Begin at the beginning, and do not allow yourself to gratify 
a mere idle curiosity by dipping into the book, here and there. 

This would very likely lead to your throwing it aside, 
with the remark “This is much too hard for me!,”
and thus losing the chance of adding a very large item 

to your stock of mental delights. 

—— LEWIS CARROLL, in Symbolic Logic (1896) 


This booklet contains draft material that I’m circulating to experts in the
field, in hopes that they can help remove its most egregious errors before too
many other people see it. I am also, however, posting it on the Internet for
courageous and/or random readers who don’t mind the risk of reading a few
pages that have not yet reached a very mature state. Beware: This material
has not yet been proofread as thoroughly as the manuscripts of Volumes 1, 2,
3, and 4A were at the time of their first printings. And those carefully-checked
volumes, alas, were subsequently found to contain thousands of mistakes.

Given this caveat, I hope that my errors this time will not be so numerous 
and/or obtrusive that you will be discouraged from reading the material carefully. 
I did try to make the text both interesting and authoritative, as far as it goes. 
But the field is vast; I cannot hope to have surrounded it enough to corral it 
completely. So I beg you to let me know about any deficiencies that you discover. 

To put the material in context, this portion of fascicle 5 previews the opening
pages of Section 7.2.2 of The Art of Computer Programming, entitled “Backtrack
programming.” The preceding section, 7.2.1, was about “Generating basic
combinatorial patterns”, namely tuples, permutations, combinations, partitions,
and trees. Now it’s time to consider the non-basic patterns, the ones that have
a much less uniform structure. For these we generally need to make tentative
choices and then we need to back up when those choices need revision. Several
subsections (7.2.2.1, 7.2.2.2, etc.) will follow this introductory material.


* * *


The explosion of research in combinatorial algorithms since the 1970s has 
meant that I cannot hope to be aware of all the important ideas in this field. 
I’ve tried my best to get the story right, yet I fear that in many respects I’m 
woefully ignorant. So I beg expert readers to steer me in appropriate directions. 

Please look, for example, at the exercises that I’ve classed as research
problems (rated with difficulty level 46 or higher), namely exercises 14, ...; I’ve
also implicitly mentioned or posed additional unsolved questions in the answers
to exercises 6, 8, 42, 45, .... Are those problems still open? Please inform me if
you know of a solution to any of these intriguing questions. And of course if no
solution is known today but you do make progress on any of them in the future,
I hope you’ll let me know.

I urgently need your help also with respect to some exercises that I made
up as I was preparing this material. I certainly don’t like to receive credit for
things that have already been published by others, and most of these results are
quite natural “fruits” that were just waiting to be “plucked.” Therefore please
tell me if you know who deserves to be credited, with respect to the ideas found
in exercises 31(b), 33, 44, 50, 51, 95, .... Furthermore I’ve credited exercises ...
to unpublished work of .... Have any of those results ever appeared in print, to
your knowledge?

I’ve got a historical question too: Have you any idea who originated the
idea of “stamping” in data structures? (See 7.2.2-(26). This concept is quite
different from the so-called time stamps in persistent data structures, and quite
different from the so-called time stamps in depth-first search algorithms, and
quite different from the so-called time stamps in cryptology, although many
programmers do use the name “time stamp” for those kinds of stamp.) It’s
a technique that I’ve seen often, in programs that have come to my attention
during recent decades, but I wonder if it ever appeared in a book or paper that
was published before, say, 1980.


* * *

Special thanks are due to ... for their detailed comments on my early attempts
at exposition, as well as to numerous other correspondents who have contributed
crucial corrections.


* * *

I happily offer a “finder’s fee” of $2.56 for each error in this draft when it is first
reported to me, whether that error be typographical, technical, or historical.
The same reward holds for items that I forgot to put in the index. And valuable
suggestions for improvements to the text are worth 32¢ each. (Furthermore, if
you find a better solution to an exercise, I’ll actually do my best to give you
immortal glory, by publishing your name in the eventual book :-)

Cross references to yet-unwritten material sometimes appear as ‘00’; this
impossible value is a placeholder for the actual numbers to be supplied later.

Happy reading! 

Stanford, California                                                  D. E. K.

99 Umbruary 2015 




Part of the Preface to Volume 4B 

During the years that I’ve been preparing Volume 4, I’ve often run across
basic techniques of probability theory that I would have put into Section 1.2
of Volume 1 if I’d been clairvoyant enough to anticipate them in the 1960s.
Finally I realized that I ought to collect most of them together in one place,
near the beginning of Volume 4B, because the story of these developments is too
interesting to be broken up into little pieces scattered here and there.

Therefore this volume begins with a special section entitled “Mathematical
Preliminaries Redux,” and future sections use the abbreviation ‘MPR’ to refer
to its equations and its exercises.
MATHEMATICAL PRELIMINARIES REDUX


Many parts of this book deal with discrete probabilities, namely with a finite or
countably infinite set $\Omega$ of atomic events $\omega$, each of which has a given
probability $\Pr(\omega)$, where

$$0 \le \Pr(\omega) \le 1 \quad\text{and}\quad \sum_{\omega\in\Omega} \Pr(\omega) = 1. \tag{1}$$


For the complete text of the special MPR section, please see Pre-Fascicle 5a.
Incidentally, Section 7.2.2 intentionally begins on a left-hand page, and its
illustrations are numbered beginning with Fig. 68, because Section 7.2.1 ended
on a right-hand page and its final illustration was Fig. 67. The editor has decided
to treat Chapter 7 as a single unit, even though it will be split across several
physical volumes.
2 COMBINATORIAL ALGORITHMS (F5B: 21 Dec 2015 @ 1742)

Nowhere to go but out, 
Nowhere to come but back. 

— BEN KING, in The Sum of Life (c. 1893) 

No one I think is in my tree. 
— JOHN LENNON, in Strawberry Fields Forever (1967)

7.2.2. Backtrack Programming 

Now that we know how to generate simple combinatorial patterns such as tuples,
permutations, combinations, partitions, and trees, we’re ready to tackle more
exotic patterns that have subtler and less uniform structure. Instances of almost
any desired pattern can be generated systematically, at least in principle, if we
organize the search carefully. Such a method was christened “backtrack” by
R. J. Walker in the 1950s, because it is basically a way to examine all fruitful
possibilities while exiting gracefully from situations that have been fully explored.

Most of the patterns we shall deal with can be cast in a simple, general
framework: We seek all sequences $x_1 x_2 \ldots x_n$ for which some property
$P_n(x_1, x_2, \ldots, x_n)$ holds, where each item $x_k$ belongs to some given domain
$D_k$ of integers. The backtrack method, in its most elementary form, consists of
inventing intermediate “cutoff” properties $P_l(x_1, \ldots, x_l)$ for $1 \le l < n$, such
that

$P_l(x_1, \ldots, x_l)$ is true whenever $P_{l+1}(x_1, \ldots, x_{l+1})$ is true;        (1)

$P_l(x_1, \ldots, x_l)$ is fairly easy to test, if $P_{l-1}(x_1, \ldots, x_{l-1})$ holds.   (2)

(We assume that $P_0()$ is always true. Exercise 1 shows that all of the basic
patterns studied in Section 7.2.1 can easily be formulated in terms of domains $D_k$
and cutoff properties $P_l$.) Then we can proceed lexicographically as follows:

Algorithm B (Basic backtrack). Given domains $D_k$ and properties $P_l$ as above,
this algorithm visits all sequences $x_1 x_2 \ldots x_n$ that satisfy $P_n(x_1, x_2, \ldots, x_n)$.

B1. [Initialize.] Set $l \gets 1$, and initialize the data structures needed later.

B2. [Enter level $l$.] (Now $P_{l-1}(x_1, \ldots, x_{l-1})$ holds.) If $l > n$, visit $x_1 x_2 \ldots x_n$
and go to B5. Otherwise set $x_l \gets \min D_l$, the smallest element of $D_l$.

B3. [Try $x_l$.] If $P_l(x_1, \ldots, x_l)$ holds, update the data structures to facilitate
testing $P_{l+1}$, set $l \gets l + 1$, and go to B2.

B4. [Try again.] If $x_l \ne \max D_l$, set $x_l$ to the next larger element of $D_l$ and
return to B3.

B5. [Backtrack.] Set $l \gets l - 1$. If $l > 0$, downdate the data structures by undoing
the changes recently made in step B3, and return to B4. (Otherwise stop.) ∎

The main point is that if $P_l(x_1, \ldots, x_l)$ is false in step B3, we needn’t waste time
trying to append any further values $x_{l+1} \ldots x_n$. Thus we can often rule out huge
regions of the space of all potential solutions. A second important point is that
very little memory is needed, although there may be many, many solutions.
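
The five steps can be transcribed almost literally into a small state machine. The following Python sketch is only an illustration, not part of the original text: the callables `domain` and `prop` are hypothetical interfaces standing for $D_k$ and $P_l$, each domain is assumed to be a nonempty sorted list, and no auxiliary data structures are updated or downdated.

```python
def backtrack(n, domain, prop):
    """Yield every (x_1, ..., x_n) satisfying P_n, following steps B1-B5.

    domain(k) -> sorted, nonempty list D_k      (hypothetical interface)
    prop(l, x) -> truth of P_l(x[1], ..., x[l]) (hypothetical interface)
    """
    x = [None] * (n + 1)        # x[1..n]; x[0] is unused
    pos = [0] * (n + 1)         # pos[l] = index of x[l] within D_l
    l, step = 1, "B2"           # B1: initialize
    while True:
        if step == "B2":        # enter level l
            if l > n:
                yield tuple(x[1:])          # visit x_1 ... x_n
                step = "B5"
            else:
                pos[l] = 0
                x[l] = domain(l)[0]         # x_l <- min D_l
                step = "B3"
        elif step == "B3":      # try x_l
            if prop(l, x):
                l += 1
                step = "B2"
            else:
                step = "B4"
        elif step == "B4":      # try the next larger element of D_l
            if pos[l] + 1 < len(domain(l)):
                pos[l] += 1
                x[l] = domain(l)[pos[l]]
                step = "B3"
            else:
                step = "B5"
        else:                   # B5: backtrack
            l -= 1
            if l == 0:
                return
            step = "B4"


# Toy instance: strictly increasing triples from {1,...,5}, i.e. 3-combinations.
combos = list(backtrack(3,
                        lambda k: [1, 2, 3, 4, 5],
                        lambda l, x: l == 1 or x[l - 1] < x[l]))
```

Because the generator keeps only the arrays `x` and `pos`, its memory use is proportional to $n$ no matter how many sequences it visits.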

For example, let’s consider the classic problem of n queens: In how many 
ways can n queens be placed on an n x n board so that no two are in the same 


row, column, or diagonal? We can suppose that one queen is in each row, and
that the queen in row $k$ is in column $x_k$, for $1 \le k \le n$. Then each domain $D_k$
is $\{1, 2, \ldots, n\}$; and $P_n(x_1, \ldots, x_n)$ is the condition that

$$x_j \ne x_k \quad\text{and}\quad |x_k - x_j| \ne k - j, \qquad\text{for } 1 \le j < k \le n. \tag{3}$$

(If $x_j = x_k$ and $j < k$, two queens are in the same column; if $|x_k - x_j| = k - j$,
they’re in the same diagonal.)

This problem is easy to set up for Algorithm B, because we can let property
$P_l(x_1, \ldots, x_l)$ be the same as (3) but restricted to $1 \le j < k \le l$. Condition (1)
is clear; and so is condition (2), because $P_l$ requires testing (3) only for $k = l$
when $P_{l-1}$ is known. Notice that $P_1(x_1)$ is always true in this example.

One of the best ways to learn about backtracking is to execute Algorithm B
by hand in the special case $n = 4$ of the $n$ queens problem: First we set $x_1 \gets 1$.
Then when $l = 2$ we find $P_2(1, 1)$ and $P_2(1, 2)$ false; hence we don’t get to $l = 3$
until trying $x_2 \gets 3$. Then, however, we’re stuck, because $P_3(1, 3, x)$ is false for
$1 \le x \le 4$. Backtracking to level 2, we now try $x_2 \gets 4$; and this allows us to
set $x_3 \gets 2$. However, we’re stuck again, at level 4; and this time we must back
up all the way to level 1, because there are no further valid choices at levels 3
and 2. The next choice $x_1 \gets 2$ does, happily, lead to a solution without much
further ado, namely $x_1 x_2 x_3 x_4 = 2413$. And one more solution (3142) turns up
before the algorithm terminates.

The behavior of Algorithm B is nicely visualized as a tree structure, called a
search tree or backtrack tree. For example, the backtrack tree for the four queens
problem has just 17 nodes,

[Diagram (4): the 17-node backtrack tree for the four queens problem.]      (4)

corresponding to the 17 times step B2 is performed. Here $x_l$ is shown as the
label of an edge from level $l - 1$ to level $l$ of the tree. (Level $l$ of the algorithm
actually corresponds to the tree’s level $l - 1$, because we’ve chosen to represent
patterns using subscripts from 1 to $n$ instead of from 0 to $n - 1$ in this discussion.)
The profile $(p_0, p_1, \ldots, p_n)$ of this particular tree, namely the number of nodes at
each level, is (1, 4, 6, 4, 2); and we see that the number of solutions, $p_n = p_4$, is 2.

Figure 68 shows the corresponding tree when $n = 8$. This tree has 2057
nodes, distributed according to the profile (1, 8, 42, 140, 344, 568, 550, 312, 92).

Thus the early cutoffs facilitated by backtracking have allowed us to find all
92 solutions by examining only 0.01% of the $8^8 = 16{,}777{,}216$ possible sequences
$x_1 \ldots x_8$. (And $8^8$ is only 0.38% of the $\binom{64}{8} = 4{,}426{,}165{,}368$ ways to put eight
queens on the board.)
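
The profiles and percentages quoted above are easy to recompute. The sketch below is a recursive restatement of the search, not Algorithm B verbatim; the function name `queens_profile` and its return convention are assumptions made for this illustration. It counts one node for every partial placement that satisfies the cutoff property, which is exactly once per execution of step B2.

```python
from math import comb


def queens_profile(n):
    """Return (profile, solutions) for the n queens backtrack tree."""
    profile = [0] * (n + 1)
    solutions = []
    x = []                      # x[k-1] is the column of the queen in row k

    def ok(t):
        # Test condition (3) for the new queen in row l against rows j < l.
        l = len(x) + 1
        return all(xj != t and abs(t - xj) != l - j
                   for j, xj in enumerate(x, start=1))

    def enter():
        profile[len(x)] += 1    # one tree node per entry to a level
        if len(x) == n:
            solutions.append(tuple(x))
            return
        for t in range(1, n + 1):
            if ok(t):
                x.append(t)
                enter()
                x.pop()         # undo the choice before trying the next t

    enter()
    return profile, solutions


profile4, sols4 = queens_profile(4)
profile8, sols8 = queens_profile(8)
share_of_sequences = 100 * sum(profile8) / 8**8      # percent of all x_1...x_8
share_of_placements = 100 * 8**8 / comb(64, 8)       # percent of all 8-subsets
```

Here `share_of_sequences` rounds to 0.01 and `share_of_placements` rounds to 0.38, in agreement with the figures in the text.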



Fig. 68. The problem of placing eight nonattacking queens has this backtrack tree.


Notice that, in this case, Algorithm B spends most of its time in the vicinity 
of level 5. Such behavior is typical: The backtrack tree for n = 16 queens has 
1,141,190,303 nodes, and its profile is (1, 16, 210, 2236, 19688, 141812, 838816, 
3998456, 15324708, 46358876, 108478966, 193892860, 260303408, 253897632, 
171158018, 72002088, 14772512), concentrated near level 12. 

Data structures. Backtrack programming is often used when a huge tree of
possibilities needs to be examined. Thus we want to be able to test property $P_l$
as quickly as possible in step B3.

One way to implement Algorithm B for the $n$ queens problem is to avoid
auxiliary data structures and simply to make a bunch of sequential comparisons
in that step: “Is $x_l - x_j \in \{j - l, 0, l - j\}$ for some $j < l$?” Assuming that we
access memory whenever referring to $x_j$, given a trial value $x_l$ in a register, such
an implementation performs approximately 112 billion memory accesses when
$n = 16$; that’s about 98 mems per node.

We can do better by introducing three simple arrays. Property $P_l$ in (3)
says essentially that the numbers $x_k$ are distinct, and so are the numbers $x_k + k$,
and so are the numbers $x_k - k$. Therefore we can use auxiliary Boolean arrays
$a_1 \ldots a_n$, $b_1 \ldots b_{2n-1}$, and $c_1 \ldots c_{2n-1}$, where $a_j$ means ‘some $x_k = j$’, $b_j$ means
‘some $x_k + k - 1 = j$’, and $c_j$ means ‘some $x_k - k + n = j$’. Those arrays are
readily updated and downdated if we customize Algorithm B as follows:

B1*. [Initialize.] Set $a_1 \ldots a_n \gets 0 \ldots 0$, $b_1 \ldots b_{2n-1} \gets 0 \ldots 0$,
$c_1 \ldots c_{2n-1} \gets 0 \ldots 0$, and $l \gets 1$.

B2*. [Enter level $l$.] (Now $P_{l-1}(x_1, \ldots, x_{l-1})$ holds.) If $l > n$, visit $x_1 x_2 \ldots x_n$
and go to B5*. Otherwise set $t \gets 1$.

B3*. [Try $t$.] If $a_t = 1$ or $b_{t+l-1} = 1$ or $c_{t-l+n} = 1$, go to B4*. Otherwise set
$a_t \gets 1$, $b_{t+l-1} \gets 1$, $c_{t-l+n} \gets 1$, $x_l \gets t$, $l \gets l + 1$, and go to B2*.

B4*. [Try again.] If $t < n$, set $t \gets t + 1$ and return to B3*.

B5*. [Backtrack.] Set $l \gets l - 1$. If $l > 0$, set $t \gets x_l$, $c_{t-l+n} \gets 0$, $b_{t+l-1} \gets 0$,
$a_t \gets 0$, and return to B4*. (Otherwise stop.) ∎

Notice how step B5* neatly undoes the updates that step B3* had made, in the
reverse order. Reverse order for downdating is typical of backtrack algorithms,
although there is some flexibility; we could, for example, have restored $a_t$ before
$b_{t+l-1}$ and $c_{t-l+n}$, because those arrays are independent.

The auxiliary arrays $a$, $b$, $c$ make it easy to test property $P_l$ at the beginning
of step B3*, but we must also access memory when we update them and downdate
them. Does that cost us more than it saves? Fortunately, no: The running time
for $n = 16$ goes down to about 34 billion mems, roughly 30 mems per node.
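
For readers who want to experiment, here is a direct Python transcription of steps B1* to B5*. It is a sketch for small $n$ only: the string-valued `step` variable is merely a device to imitate the “go to” instructions, the function name is hypothetical, and nothing here measures mems.

```python
def queens_bstar(n):
    """Return all n queens solutions in lexicographic order (steps B1*-B5*)."""
    a = [0] * (n + 1)           # a[j] = 1 means some x_k = j
    b = [0] * (2 * n)           # b[j] = 1 means some x_k + k - 1 = j
    c = [0] * (2 * n)           # c[j] = 1 means some x_k - k + n = j
    x = [0] * (n + 1)           # x[1..n]; x[0] is unused
    solutions = []
    l, t, step = 1, 1, "B2"     # B1*: all arrays are zero
    while True:
        if step == "B2":
            if l > n:
                solutions.append(tuple(x[1:]))       # visit
                step = "B5"
            else:
                t, step = 1, "B3"
        elif step == "B3":
            if a[t] or b[t + l - 1] or c[t - l + n]:
                step = "B4"                          # column or diagonal is taken
            else:
                a[t] = b[t + l - 1] = c[t - l + n] = 1
                x[l] = t
                l += 1
                step = "B2"
        elif step == "B4":
            if t < n:
                t += 1
                step = "B3"
            else:
                step = "B5"
        else:                                        # B5*
            l -= 1
            if l == 0:
                return solutions
            t = x[l]
            c[t - l + n] = b[t + l - 1] = a[t] = 0   # downdate the three arrays
            step = "B4"
```

For example, `queens_bstar(4)` yields the two solutions 2413 and 3142 found by hand earlier.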

Furthermore we could keep the bit vectors $a$, $b$, $c$ entirely in registers, on a
machine with 64-bit registers, assuming that $n \le 32$. Then there would be just
two memory accesses per node, namely to store $x_l \gets t$ and later to fetch $t \gets x_l$;
however, quite a lot of in-register computation would become necessary.

Walker’s method. The 1950s-era programs of R. J. Walker organized
backtracking in a somewhat different way. Instead of letting $x_l$ run through all
elements of $D_l$, he calculated and stored the set

$$S_l \gets \{x \in D_l \mid P_l(x_1, \ldots, x_{l-1}, x) \text{ holds}\} \tag{5}$$

upon entry to each node at level $l$. This computation can often be done efficiently
all at once, instead of piecemeal, because some cutoff properties make it possible
to combine steps that would otherwise have to be repeated for each $x \in D_l$. In
essence, he used the following variant of Algorithm B:

Algorithm W (Walker’s backtrack). Given domains $D_k$ and cutoffs $P_l$ as above,
this algorithm visits all sequences $x_1 x_2 \ldots x_n$ that satisfy $P_n(x_1, x_2, \ldots, x_n)$.

W1. [Initialize.] Set $l \gets 1$, and initialize the data structures needed later.

W2. [Enter level $l$.] (Now $P_{l-1}(x_1, \ldots, x_{l-1})$ holds.) If $l > n$, visit $x_1 x_2 \ldots x_n$
and go to W4. Otherwise determine the set $S_l$ as in (5).

W3. [Try to advance.] If $S_l$ is nonempty, set $x_l \gets \min S_l$, update the data
structures to facilitate computing $S_{l+1}$, set $l \gets l + 1$, and go to W2.

W4. [Backtrack.] Set $l \gets l - 1$. If $l > 0$, downdate the data structures by
undoing changes made in step W3, set $S_l \gets S_l \setminus x_l$, and retreat to W3. ∎

Walker applied this method to the $n$ queens problem by computing $S_l =
U \setminus A_l \setminus B_l \setminus C_l$, where $U = D_l = \{1, \ldots, n\}$ and

$$A_l = \{x_j \mid 1 \le j < l\}, \quad B_l = \{x_j + j - l \mid 1 \le j < l\}, \quad C_l = \{x_j - j + l \mid 1 \le j < l\}. \tag{6}$$

He represented these auxiliary sets by bit vectors $a$, $b$, $c$, analogous to (but
different from) the bit vectors of Algorithm B* above. Exercise 9 shows that
the updating in step W3 is easy, using bitwise operations on $n$-bit numbers;
furthermore, no downdating is needed in step W4. The corresponding run time
when $n = 16$ turns out to be just 9.1 gigamems, or 8 mems per node.
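
One well-known way to realize the sets (6) with bitwise operations is sketched below. It is an illustration under stated assumptions and not necessarily the method of exercise 9: bit $i$ of each vector stands for column $i + 1$, the vectors for level $l + 1$ are derived from those for level $l$ by one shift in each diagonal direction, and a separate copy is kept per level so that backtracking needs no downdating. The function name is hypothetical.

```python
def queens_walker(n):
    """Count n queens solutions in the manner of Algorithm W, with bit vectors."""
    mask = (1 << n) - 1                  # the universe U as an n-bit number
    a = [0] * (n + 2)                    # a[l], b[l], c[l] encode A_l, B_l, C_l
    b = [0] * (n + 2)
    c = [0] * (n + 2)
    S = [0] * (n + 2)                    # S[l] = choices not yet tried at level l
    count = 0
    l = 1
    S[1] = mask                          # nothing is attacked in the first row
    while l > 0:
        if l > n:                        # W2: a complete placement
            count += 1
            l -= 1                       # W4
        elif S[l]:                       # W3: advance with the smallest element
            t = S[l] & -S[l]             # lowest set bit = min S_l
            S[l] -= t                    # so that S_l excludes x_l on return
            a[l + 1] = a[l] | t
            b[l + 1] = (b[l] | t) >> 1           # x_j + j - l drops by 1 per row
            c[l + 1] = ((c[l] | t) << 1) & mask  # x_j - j + l grows by 1 per row
            l += 1
            if l <= n:
                S[l] = mask & ~(a[l] | b[l] | c[l])
        else:                            # W4: S_l is exhausted
            l -= 1
    return count
```

Removing `t` from `S[l]` at the moment it is chosen has the same effect as the assignment $S_l \gets S_l \setminus x_l$ in step W4, and it spares us from storing $x_l$ when only a count is wanted.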

Let $Q(n)$ be the number of solutions to the $n$ queens problem. Then we have

  n    =  0  1  2  3  4   5  6   7   8    9   10
  Q(n) =  1  1  0  0  2  10  4  40  92  352  724

  n    =    11     12     13      14       15        16
  Q(n) =  2680  14200  73712  365596  2279184  14772512

and the values for $n \le 11$ were computed independently by several people during
the nineteenth century. Small cases were relatively easy; but when T. B. Sprague 
had finished computing $Q(11)$ he remarked that “This was a very heavy piece of
work, and occupied most of my leisure time for several months. ... It will, I
imagine, be scarcely possible to obtain results for larger boards, unless a number of
persons co-operate in the work.” [See Proc. Edinburgh Math. Soc. 17 (1899), 43-68;
Sprague was the leading actuary of his day.] Nevertheless, H. Onnen went on
to evaluate $Q(12) = 14{,}200$ (an astonishing feat of hand calculation) in 1910.
[See W. Ahrens, Math. Unterhaltungen und Spiele 2, second edition (1918), 344.]

All of these hard-won results were confirmed in 1960 by R. J. Walker,
using the SWAC computer at UCLA and the method of exercise 9. Walker also
computed $Q(13)$; but he couldn’t go any further with the machine available to
him at the time. The next step, $Q(14)$, was computed by Michael D. Kennedy at
the University of Tennessee in 1963, commandeering an IBM 1620 for 120 hours.
S. R. Bunch evaluated $Q(15)$ in 1974 at the University of Illinois, using about
two hours on an IBM System 360-75; then J. R. Bitner found $Q(16)$ after about
three hours on the same computer, but with an improved method.

Computers and algorithms have continued to get better, of course, and such
results are now obtained almost instantly. Hence larger and larger values of $n$ lie
at the frontier. The current record as of 2015 is $Q(26) = 22{,}317{,}699{,}616{,}364{,}044$,
found in 2009 by Thomas B. Preußer of the University of Dresden. (His
distributed computation occupied a dynamic cluster of up to 26 diverse FPGA
devices for 270 days; those devices provided a total peak of 550 custom-designed
hardware solvers to handle 25,204,802 subproblems individually.)

Permutations and Langford pairs. Every solution $x_1 \ldots x_n$ to the $n$ queens
problem is a permutation of $\{1, \ldots, n\}$, and many other problems are
permutation-based. Indeed, we’ve already seen Algorithm 7.2.1.2X, which is an
elegant backtrack procedure specifically designed for special kinds of permutations.
When that algorithm begins to choose the value of $x_l$, it makes all of the appropriate
elements $\{1, 2, \ldots, n\} \setminus \{x_1, \ldots, x_{l-1}\}$ conveniently accessible in a linked list.

We can get further insight into such data structures by returning to the
problem of Langford pairs, which was discussed at the very beginning of
Chapter 7. That problem can be reformulated as the task of finding all permutations
of $\{1, 2, \ldots, n\} \cup \{-1, -2, \ldots, -n\}$ with the property that

$$x_j = k \quad\text{implies}\quad x_{j+k+1} = -k, \qquad\text{for } 1 \le j \le 2n \text{ and } 1 \le k \le n. \tag{7}$$

For example, when $n = 4$ there are two solutions, namely $2\,3\,4\,\bar2\,1\,\bar3\,\bar1\,\bar4$ and
$4\,1\,3\,\bar1\,2\,\bar4\,\bar3\,\bar2$. (As usual we find it convenient to write $\bar1$ for $-1$, $\bar2$ for $-2$, etc.)
Notice that whenever $x = x_1 x_2 \ldots x_{2n}$ is a solution, its “dual”
$-x^R = (-x_{2n}) \ldots (-x_2)(-x_1)$ is also a solution.

Here’s a Langford-inspired adaptation of Algorithm 7.2.1.2X, with the
former notation modified slightly to match Algorithms B and W: We want to
maintain pointers $p_0 p_1 \ldots p_n$ such that, if the positive integers not already present in
$x_1 \ldots x_{l-1}$ are $k_1 < k_2 < \cdots < k_t$ when we’re choosing $x_l$, we have the linked list

$$p_0 = k_1, \quad p_{k_1} = k_2, \quad \ldots, \quad p_{k_{t-1}} = k_t, \quad p_{k_t} = 0. \tag{8}$$

Such a condition turns out to be easy to maintain.
Algorithm L (Langford pairs). This algorithm visits all solutions $x_1 \ldots x_{2n}$
to (7) in lexicographic order, using pointers $p_0 p_1 \ldots p_n$ that satisfy (8), and also
using an auxiliary array $y_1 \ldots y_{2n}$ for backtracking.

L1. [Initialize.] Set $x_1 \ldots x_{2n} \gets 0 \ldots 0$, $p_k \gets k + 1$ for $0 \le k < n$, $p_n \gets 0$, $l \gets 1$.

L2. [Enter level $l$.] Set $k \gets p_0$. If $k = 0$, visit $x_1 x_2 \ldots x_{2n}$ and go to L5.
Otherwise set $j \gets 0$, and while $x_l < 0$ set $l \gets l + 1$.

L3. [Try $x_l = k$.] (At this point we have $k = p_j$.) If $l + k + 1 > 2n$, go to L5.
Otherwise, if $x_{l+k+1} = 0$, set $x_l \gets k$, $x_{l+k+1} \gets -k$, $y_l \gets j$, $p_j \gets p_k$,
$l \gets l + 1$, and return to L2.

L4. [Try again.] (We’ve found all solutions that begin with $x_1 \ldots x_{l-1} k$ or
something smaller.) Set $j \gets k$ and $k \gets p_j$, then go to L3 if $k \ne 0$.

L5. [Backtrack.] Set $l \gets l - 1$. If $l > 0$ do the following: While $x_l < 0$, set
$l \gets l - 1$. Then set $k \gets x_l$, $x_l \gets 0$, $x_{l+k+1} \gets 0$, $j \gets y_l$, $p_j \gets k$, and go
back to L4. Otherwise terminate the algorithm. ∎

Careful study of these steps will reveal how everything fits together nicely. Notice
that, for example, step L3 removes $k$ from the linked list (8) by simply setting
$p_j \gets p_k$. That step also sets $x_{l+k+1} \gets -k$, in accordance with (7), so that we
can skip over position $l + k + 1$ when we encounter it later in step L2.

The main point of Algorithm L is the somewhat subtle way in which step L5
undoes the deletion operation by setting $p_j \gets k$. The pointer $p_k$ still retains the
appropriate link to the next element in the list, because $p_k$ has not been changed
by any of the intervening updates. (Think about it.) This is the germ of an idea
called “dancing links” that we will explore in Section 7.2.2.1.
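
Steps L1 to L5 translate into Python as follows. This is a sketch meant for small $n$; the function name and the string-valued `step` variable, which imitates the “go to” instructions, are devices of this illustration rather than part of the algorithm. Note how the undeletion `p[j] = k` relies on `p[k]` being untouched since the matching deletion.

```python
def langford(n):
    """Return all solutions x_1 ... x_2n of (7), in lexicographic order."""
    x = [0] * (2 * n + 2)                    # x[1..2n]; 0 means "not yet set"
    y = [0] * (2 * n + 2)                    # y[l] remembers j for undeletion
    p = [k + 1 for k in range(n)] + [0]      # L1: linked list 1 -> 2 -> ... -> n
    solutions = []
    l, j, k, step = 1, 0, 0, "L2"
    while True:
        if step == "L2":
            k = p[0]
            if k == 0:                       # every positive value is placed
                solutions.append(x[1:2 * n + 1])
                step = "L5"
            else:
                j = 0
                while x[l] < 0:              # skip positions already filled
                    l += 1
                step = "L3"
        elif step == "L3":
            if l + k + 1 > 2 * n:
                step = "L5"
            elif x[l + k + 1] == 0:
                x[l], x[l + k + 1] = k, -k
                y[l] = j
                p[j] = p[k]                  # delete k from the list
                l += 1
                step = "L2"
            else:
                step = "L4"
        elif step == "L4":
            j = k
            k = p[j]
            step = "L3" if k != 0 else "L5"
        else:                                # L5
            l -= 1
            if l == 0:
                return solutions
            while x[l] < 0:
                l -= 1
            k = x[l]
            x[l] = x[l + k + 1] = 0
            j = y[l]
            p[j] = k                         # undelete k; p[k] is still valid
            step = "L4"
```

Calling `langford(4)` returns the two solutions displayed earlier, with barred digits represented as negative integers.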

To draw the search tree corresponding to a run of Algorithm L, we can label
the edges with the positive choices of $x_l$ as we did in (4), while labeling the
nodes with any previously set negative values that are passed over in step L2.
For instance the tree for $n = 4$ is

[Diagram (9): the search tree of Algorithm L for n = 4.]                    (9)

Solutions appear at depth $n$ in this tree, even though they involve $2n$ values
$x_1 x_2 \ldots x_{2n}$.

Algorithm L sometimes makes false starts and doesn’t realize the problem
until probing further than necessary. Notice that the value $x_l = k$ can appear
only when $l + k + 1 \le 2n$; hence if we haven’t seen $k$ by the time $l$ reaches
$2n - k - 1$, we’re forced to choose $x_l = k$. For example, the branch $1\,2\,\bar1$ in (9)
needn’t be pursued, because 4 must appear in $\{x_1, x_2, x_3\}$. Exercise 20 explains
how to incorporate this cutoff principle into Algorithm L. When $n = 17$, it
reduces the number of nodes in the search tree from 1.29 trillion to 330 billion,
and reduces the running time from 25.0 teramems to 8.1 teramems. (The amount
of work has gone from 19.4 mems per node to 24.4 mems per node, because of
the extra tests for cutoffs, yet there’s a significant overall reduction.)

Furthermore, we can “break the symmetry” by ensuring that we don’t
consider both a solution and its dual. This idea, exploited in exercise 21, reduces
the search tree to just 160 billion nodes and costs just 3.94 teramems; that’s
24.6 mems per node.

Word rectangles. Let’s look next at a problem where the search domains $D_l$
are much larger. An $m \times n$ word rectangle is an array of $n$-letter words* whose
columns are $m$-letter words. For example,

    status
    lowest
    utopia                                                              (10)
    making
    sledge

is a $5 \times 6$ word rectangle whose columns all belong to WORDS(5757), the collection
of 5-letter words in the Stanford GraphBase. To find such patterns, we can
suppose that column $l$ contains the $x_l$th most common 5-letter word, where
$1 \le x_l \le 5757$ for $1 \le l \le 6$; hence there are
$5757^6 = 36{,}406{,}369{,}848{,}837{,}732{,}146{,}649$ ways
to choose the columns. In (10) we have $x_1 \ldots x_6 = 1446\ 185\ 1021\ 2537\ 66\ 255$.
Of course very few of those choices will yield suitable rows; but backtracking will
hopefully help us to find all solutions in a reasonable amount of time.

We can set this problem up for Algorithm B by storing the $n$-letter words
in a trie (see Section 6.3), with one trie node of size 26 for each $l$-letter prefix of
a legitimate word, $0 \le l \le n$.

For example, such a trie for $n = 6$ represents 15727 words with 23667 nodes.
The prefix st corresponds to node number 260, whose 26 entries are

(484,0,0,0,1589,0,0,0,2609,0,0,0,0,0,1280,0,0,251,0,0,563,0,0,0,1621,0);   (11)

this means that sta is node 484, ste is node 1589, ..., sty is node 1621, and
there are no 6-letter words beginning with stb, stc, ..., stx, stz. A slightly
different convention is used for prefixes of length $n - 1$; for example, the entries
for node 580, ‘corne’, are

(3879,0,0,3878,0,0,0,0,0,0,0,9602,0,0,0,0,0,171,0,5013,0,0,0,0,0,0),       (12)

meaning that cornea, corned, cornel, corner, and cornet are ranked 3879,
3878, 9602, 171, and 5013 in the list of 6-letter words.
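
A trie with this layout can be built in a few lines. The following Python sketch is illustrative only: it numbers nodes in order of creation, which will not reproduce the particular node numbers 260 and 580 quoted above, and the helper names `build_trie`, `find`, and `next_letters` are hypothetical. The five-word list used to exercise it is a toy stand-in for a ranked dictionary.

```python
def build_trie(words):
    """Build 26-way trie nodes for equal-length lowercase words.

    words[0] has rank 1, words[1] has rank 2, and so on.  An entry of an
    inner node is a child node number (0 = absent); an entry of a node for
    a prefix of length n-1 is the rank of the completed word (0 = absent).
    """
    nodes = [[0] * 26]                       # node 0 is the empty prefix
    for rank, w in enumerate(words, start=1):
        q = 0
        for ch in w[:-1]:
            i = ord(ch) - ord("a")
            if nodes[q][i] == 0:             # first word with this prefix
                nodes.append([0] * 26)
                nodes[q][i] = len(nodes) - 1
            q = nodes[q][i]
        nodes[q][ord(w[-1]) - ord("a")] = rank
    return nodes


def find(nodes, prefix):
    """Return the node number of a proper prefix, or None if it is absent."""
    q = 0
    for ch in prefix:
        q = nodes[q][ord(ch) - ord("a")]
        if q == 0:
            return None
    return q


def next_letters(nodes, q):
    """Return the letters that can legally follow the prefix of node q."""
    return "".join(chr(ord("a") + i) for i in range(26) if nodes[q][i])


toy = build_trie(["corner", "cornet", "cornea", "status", "stereo"])
```

With this toy list, the node for ‘corne’ holds rank 3 in its a entry, rank 1 in its r entry, and rank 2 in its t entry, mirroring the convention of (12).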

* Whenever five-letter words are used in the examples of this book, they’re taken from the
5757 Stanford GraphBase words as explained at the beginning of Chapter 7. Words of other
lengths are taken from The Official SCRABBLE® Players Dictionary, fourth edition (Hasbro,
2005), because those words have been incorporated into many widely available computer games.
Such words have been ranked according to the British National Corpus of 2007, where ‘the’
occurs 5,405,633 times and the next-most common word, ‘of’, occurs roughly half as often
(3,021,525). The OSPD4 list includes respectively (101, 1004, 4002, 8887, 15727, 23958, 29718,
29130, 22314) words of lengths (2, 3, ..., 10), of which (97, 771, 2451, 4474, 6910, 8852, 9205,
8225, 6626) occur at least six times in the British National Corpus.


Suppose $x_1$ and $x_2$ specify the 5-letter column-words slums and total as
in (10). Then the trie tells us that the next column-word $x_3$ must have the form
$c_1 c_2 c_3 c_4 c_5$ where $c_1 \in \{a, e, i, o, r, u, y\}$, $c_2 \notin \{e, h, j, k, y, z\}$, $c_3 \in \{e, m, o, t\}$,
$c_4 \notin \{a, b, o\}$, and $c_5 \in \{a, e, i, o, u, y\}$. (There are 221 such words.)

Let $a_{l1} \ldots a_{lm}$ be the trie nodes corresponding to the prefixes of the first
$l$ columns of a partial solution to the word rectangle problem. This auxiliary
array enables Algorithm B to find all solutions, as explained in exercise 24. It
turns out that there are exactly 625,415 valid $5 \times 6$ word rectangles, according
to our conventions; and the method of exercise 24 needs about 19 teramems of
computation to find them all. In fact, the profile of the search tree is

(1, 5757, 2458830, 360728099, 579940198, 29621728, 625415),               (13)

indicating for example that just 360,728,099 of the $5757^3 = 190{,}804{,}533{,}093$
choices for $x_1 x_2 x_3$ will lead to valid prefixes of 6-letter words.

With care, exercise 24’s running time can be significantly decreased, once
we realize that every node of the search tree for $1 \le l \le n$ requires testing 5757
possibilities for $x_l$ in step B3. If we build a more elaborate data structure for the
5-letter words, so that it becomes easy to run through all words that have a specific
letter in a specific position, we can refine the algorithm so that the average
number of possibilities per level that need to be investigated becomes only

(5757.0, 1697.9, 844.1, 273.5, 153.5, 100.8);                              (14)

the total running time then drops to 1.15 teramems. Exercise 25 has the details.
And exercise 28 discusses a method that’s faster yet.

Commafree codes. Our next example deals entirely with four-letter words.
But it’s not obscene; it’s an intriguing question of coding theory. The problem
is to find a set of four-letter words that can be decoded even if we don’t put
spaces or other delimiters between them. If we take any message that’s formed
from words of the set by simply concatenating them together, likethis, and
if we look at any seven consecutive letters $\ldots x_1 x_2 x_3 x_4 x_5 x_6 x_7 \ldots$, exactly one
of the four-letter substrings $x_1 x_2 x_3 x_4$, $x_2 x_3 x_4 x_5$, $x_3 x_4 x_5 x_6$, $x_4 x_5 x_6 x_7$ will be a
codeword. Equivalently, if $x_1 x_2 x_3 x_4$ and $x_5 x_6 x_7 x_8$ are codewords, then $x_2 x_3 x_4 x_5$
and $x_3 x_4 x_5 x_6$ and $x_4 x_5 x_6 x_7$ aren’t. (For example, iket isn’t.) Such a set is
called a “commafree code” or a “self-synchronizing block code” of length four.
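
The second formulation of the definition leads directly to a brute-force test. The sketch below assumes that all codewords have the same length and that the code is nonempty; the name `is_commafree` is hypothetical.

```python
def is_commafree(code):
    """Return True if no codeword straddles the boundary of two codewords."""
    code = set(code)
    n = len(next(iter(code)))            # common word length (4 in the text)
    for u in code:
        for v in code:                   # u and v may be the same word
            uv = u + v
            # Offsets 1..n-1 are the windows that overlap both u and v.
            if any(uv[i:i + n] in code for i in range(1, n)):
                return False
    return True
```

For instance, the set {like, this} passes the test, because none of iket, keth, ethi, and the other straddling windows is a codeword; but {item, emit} fails, since emit appears inside itemitem.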

Commafree codes were introduced by F. H. C. Crick, J. S. Griffith, and
L. E. Orgel [Proc. National Acad. Sci. 43 (1957), 416-421], and studied further
by S. W. Golomb, B. Gordon, and L. R. Welch [Canadian Journal of Mathematics
10 (1958), 202-209], who considered the general case of $m$-letter alphabets and
$n$-letter words. They constructed optimum commafree codes for all $m$ when $n = 2$,
3, 5, 7, 9, 11, 13, and 15; and optimum codes for all $m$ were subsequently found
also for $n = 17$, 19, 21, ... (see exercise 32). We will focus our attention on the
four-letter case here ($n = 4$), partly because that case is still very far from being
resolved, but mostly because the task of finding such codes is especially
instructive. Indeed, our discussion will lead us naturally to an understanding of several
significant techniques that are important for backtrack programming in general.
To begin, we can see immediately that a commafree codeword cannot be
“periodic,” like dodo or gaga. Such a word already appears within two adjacent
copies of itself. Thus we’re restricted to aperiodic words like item, of which there
are $m^4 - m^2$. Notice further that if item has been chosen, we aren’t allowed
to include any of its cyclic shifts temi, emit, or mite, because they all appear
within itemitem. Hence the maximum number of codewords in our commafree
code cannot exceed $(m^4 - m^2)/4$.

For example, consider the binary case, $m = 2$, when this maximum is 3.
Can we choose three four-bit “words,” one from each of the cyclic classes

    [0001] = {0001, 0010, 0100, 1000},
    [0011] = {0011, 0110, 1100, 1001},                                  (15)
    [0111] = {0111, 1110, 1101, 1011},

so that the resulting code is commafree? Yes: One solution in this case is simply
to choose the smallest word in each class, namely 0001, 0011, and 0111. (Alert
readers will recall that we studied the smallest word in the cyclic class of any
aperiodic string in Section 7.2.1.1, where such words were called prime strings
and where some of the remarkable properties of prime strings were proved.)

That trick doesn’t work when $m = 3$, however, when there are $(81 - 9)/4 =
18$ cyclic classes. Then we cannot include 1112 after we’ve chosen 0001 and 0011.
Indeed, a code that contains 0001 and 1112 can’t contain either 0011 or 0111.
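
These counts, and the success or failure of the smallest-word trick, can be checked mechanically. The sketch below is an illustration under stated assumptions: letters are the digit characters 0 through $m - 1$ (so $m \le 10$), and both function names are hypothetical. The commafree test simply looks for a codeword that straddles two adjacent codewords.

```python
from itertools import product


def cyclic_classes(m, n=4):
    """Map the smallest word of each cyclic class to the sorted class."""
    classes = {}
    for letters in product("0123456789"[:m], repeat=n):
        w = "".join(letters)
        shifts = {w[i:] + w[:i] for i in range(n)}
        if len(shifts) == n:             # aperiodic: all n shifts are distinct
            classes.setdefault(min(shifts), sorted(shifts))
    return classes


def is_commafree(code):
    """Return True if no codeword straddles two adjacent codewords."""
    code = set(code)
    n = len(next(iter(code)))
    return not any((u + v)[i:i + n] in code
                   for u in code for v in code for i in range(1, n))


binary = cyclic_classes(2)               # the three classes of (15)
ternary = cyclic_classes(3)              # 18 classes
trick_works_for_2 = is_commafree(binary)     # dict keys = smallest words
trick_works_for_3 = is_commafree(ternary)
```

Here `trick_works_for_2` is true and `trick_works_for_3` is false; one witness for the latter is the window 0011 inside 0001 followed by 1112.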

We could systematically backtrack through 18 levels, choosing $x_1$ in [0001]
and $x_2$ in [0011], etc., and rejecting each $x_l$ as in Algorithm B whenever we
discover that $\{x_1, x_2, \ldots, x_l\}$ isn’t commafree. For example, if $x_1 = 0010$ and
we try $x_2 = 1001$, this approach would backtrack because $x_1$ occurs inside $x_2 x_1$.

But a naive strategy of that kind，which recognizes failure only after a 
bad choice has been made, can be vastly improved. If we had been clever 
enough，we could have looked a little bit ahead，and never even considered the 
choice X 2 = 1001 in the first place. Indeed, after choosing x\ — 0010, we can 
automatically exclude all further words of the form *001，such as 2001 when 
m > 3 and 3001 when m > 4. 

Even better pruning occurs if, for example, we’ve chosen x_1 = 0001 and
x_2 = 0011. Then we can immediately rule out all words of the forms 1*** or
***0, because x_1 1*** includes x_2 and ***0 x_2 includes x_1. Already we could then
deduce, in the case m ≥ 3, that classes [0002], [0021], [0111], [0211], and [1112]
must be represented by 0002, 0021, 0111, 0211, and 2111, respectively; each of
the other three possibilities in those classes has been wiped out!
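
The forced choices just mentioned can be reproduced by brute force. The
following sketch lists, for every class that has no chosen word yet, the cyclic
shifts that remain compatible with the words chosen so far. The names shifts,
compatible, and surviving_candidates are hypothetical helpers, and the pairwise
test is a simple stand-in for real lookahead (it assumes m ≤ 10).

```python
from itertools import product

def shifts(w):
    return [w[k:] + w[:k] for k in range(len(w))]

def compatible(green, w):
    """True if the chosen words together with w still form a commafree set."""
    words = set(green) | {w}
    n = len(w)
    return not any((x + y)[k:k + n] in words
                   for x in words for y in words for k in range(1, n))

def surviving_candidates(green, m, n=4):
    """For each class without a green word, the shifts that are still usable."""
    result = {}
    for letters in product(range(m), repeat=n):
        w = "".join(map(str, letters))
        cls = shifts(w)
        if w != min(cls) or len(set(cls)) < n:
            continue                      # not the name of an aperiodic class
        if any(g in cls for g in green):
            continue                      # class already represented
        result[w] = [v for v in cls if compatible(green, v)]
    return result

if __name__ == "__main__":
    left = surviving_candidates(["0001", "0011"], m=3)
    for name, words in left.items():
        if len(words) == 1:
            print(name, "is forced to be", words[0])
```

With x_1 = 0001 and x_2 = 0011 and m = 3, the classes [0002], [0021], [0111],
[0211], and [1112] each retain exactly one candidate, as stated above.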

Thus we see the desirability of a lookahead mechanism. 

Dynamic ordering of choices. Furthermore, we can see from this example
that it’s not always good to choose x_1, then x_2, then x_3, and so on when trying
to satisfy a general property P_n(x_1, x_2, ..., x_n) in the setting of Algorithm B.
Maybe the search tree will be much smaller if we first choose some other variable
x_i, say, and then turn next to some other x_j, depending on the particular value
of x_i that was selected. Some orderings might have much better cutoff properties
than others, and every branch of the tree is free to choose its variables in any
desired order.

Indeed, our commafree coding problem for ternary 4-tuples doesn’t dictate
any particular ordering of the 18 classes that would be likely to keep the search
tree small. Therefore, instead of calling those choices x_1, x_2, ..., x_18, it’s better
to identify them by the various class names, namely x_0001, x_0002, x_0011, x_0012,
x_0021, x_0022, x_0102, x_0111, x_0112, x_0121, x_0122, x_0211, x_0212, x_0221, x_0222, x_1112,
x_1122, x_1222. (Algorithm 7.2.1.1F is a good way to generate those names.) At
every node of the search tree we then can choose a convenient variable on which
to branch, based on previous choices. After beginning with x_0001 ← 0001 at
level 1 we might decide to try x_0011 ← 0011 at level 2; and then, as we’ve seen,
the choices x_0002 ← 0002, x_0021 ← 0021, x_0111 ← 0111, x_0211 ← 0211, and
x_1112 ← 2111 are forced, so we should make them at levels 3 through 7.

Furthermore, after those forced moves are made, it turns out that they don’t
force any others. But only two choices for x_0012 will remain, while x_0122 will have
three. Therefore it will probably be wiser to branch on x_0012 rather than on x_0122
at level 8. (Incidentally, it also turns out that there is no commafree code with
x_0001 = 0001 and x_0011 = 0011, except when m = 2.)

It’s easy to adapt Algorithms B and W to allow dynamic ordering. Every 
node of the search tree can be given a “frame” in which we record the variable 
being set and the choice that was made. This choice of variable and value can 
be called a “move” made by the backtrack procedure. 

Dynamic ordering can be helpful also after backtracking has taken place. If
we continue the example above, where x_0001 = 0001 and we’ve explored all cases
in which x_0011 = 0011, we aren’t obliged to continue by trying another value
for x_0011. We do want to remember that 0011 should no longer be considered
legal, until x_0001 changes; but we could decide to explore next a case such as
x_0002 = 2000 at level 2. In fact, x_0002 = 2000 is quickly seen to be impossible in
the presence of 0001 (see exercise 34). An even more efficient choice at level 2,
however, is x_0012 = 0012, because that branch immediately forces x_0002 = 0002,
x_0022 = 0022, x_0122 = 0122, x_0222 = 0222, x_1222 = 1222, and x_0011 = 1001.
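
The idea of branching on whichever class currently has the fewest surviving
candidates can be sketched in a few lines. This is a naive illustration of
dynamic ordering only; it recomputes all candidate lists at every node instead
of maintaining them incrementally, and the names commafree_codes and extend are
assumptions rather than anything defined in the text. It also assumes m ≤ 10.

```python
from itertools import product

def shifts(w):
    return [w[k:] + w[:k] for k in range(len(w))]

def compatible(green, w):
    """True if the chosen words together with w still form a commafree set."""
    words = set(green) | {w}
    n = len(w)
    return not any((x + y)[k:k + n] in words
                   for x in words for y in words for k in range(1, n))

def commafree_codes(m, n=4):
    """Yield every commafree code that has one word from each cyclic class."""
    names = [w for w in ("".join(map(str, t)) for t in product(range(m), repeat=n))
             if w == min(shifts(w)) and len(set(shifts(w))) == n]

    def extend(green, open_classes):
        if not open_classes:
            yield list(green)
            return
        options = {c: [v for v in shifts(c) if compatible(green, v)]
                   for c in open_classes}
        # Dynamic ordering: branch on a class with the fewest options left.
        c = min(open_classes, key=lambda name: len(options[name]))
        for v in options[c]:              # an empty list means "backtrack"
            yield from extend(green + [v], [d for d in open_classes if d != c])

    yield from extend([], names)

if __name__ == "__main__":
    for code in commafree_codes(2):
        print(sorted(code))
```

A class with a single option is chosen before any class with several, so forced
moves are made as soon as they arise; a class with no options cuts off the
branch immediately.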

Sequential allocation redux. The choice of a variable and value on which to
branch is a delicate tradeoff. We don’t want to devote more time to planning
than we’ll save by having a good plan.

If we’re going to benefit from dynamic ordering, we’ll need efficient data
structures that will lead to good decisions without much deliberation. On the
other hand, elaborate data structures need to be updated whenever we branch
to a new level, and they need to be downdated whenever we return from that
level. Algorithm L illustrates an efficient mechanism based on linked lists; but
sequentially allocated lists are often even more appealing, because they are
cache-friendly and they involve fewer accesses to memory.

Assume then that we wish to represent a set of items as an unordered
sequential list. The list begins in a cell of memory pointed to by HEAD, and
TAIL points just beyond the end of the list. For example,

    [Diagram: consecutive MEM cells, with HEAD at the first item and TAIL just past the last.]        (16)

is one way to represent the set {1, 3, 4, 9}. The number of items currently in the
set is TAIL − HEAD; thus TAIL = HEAD if and only if the list is empty. If we wish
to insert a new item x, knowing that x isn’t already present, we simply set

    MEM[TAIL] ← x, TAIL ← TAIL + 1.        (17)

Conversely, if HEAD ≤ P < TAIL, we can easily delete MEM[P]:

    TAIL ← TAIL − 1; if P ≠ TAIL, set MEM[P] ← MEM[TAIL].        (18)

(We’ve tacitly assumed in (17) that MEM[TAIL] is available for use whenever a
new item is inserted. Otherwise we would have had to test for memory overflow.)
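
Operations (17) and (18) translate almost word for word into code. The sketch
below keeps a private array with HEAD fixed at 0; the class name SequentialSet
and its method names are illustrative assumptions, and the order in which items
are inserted in the demonstration is arbitrary.

```python
class SequentialSet:
    """Unordered set kept in mem[head .. tail-1], following (17) and (18)."""

    def __init__(self, capacity):
        self.mem = [None] * capacity   # private stand-in for MEM, with HEAD = 0
        self.head = 0
        self.tail = 0

    def insert(self, x):
        """Operation (17); the caller guarantees that x is not yet present."""
        self.mem[self.tail] = x
        self.tail += 1

    def delete_at(self, p):
        """Operation (18); requires head <= p < tail."""
        self.tail -= 1
        if p != self.tail:
            self.mem[p] = self.mem[self.tail]   # move the last item into the hole

    def items(self):
        return self.mem[self.head:self.tail]

if __name__ == "__main__":
    s = SequentialSet(10)
    for x in (3, 9, 1, 4):
        s.insert(x)
    s.delete_at(1)                 # remove whatever sits in the second cell
    print(s.items(), s.tail - s.head)
```

Deletion takes constant time because the list is unordered: the hole is filled
by the last item instead of shifting everything down.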

We can’t delete an item from a list without knowing its MEM location. Thus
we will often want to maintain an “inverse list,” assuming that all items x lie in
the range 0 ≤ x < M. For example, (16) becomes the following, if M = 10:

    [Diagram: the list at HEAD as in (16), together with an inverse list of 10 cells beginning at IHEAD.]        (19)

(Shaded cells have undefined contents.) With this setup, insertion (17) becomes

    MEM[TAIL] ← x, MEM[IHEAD + x] ← TAIL, TAIL ← TAIL + 1,        (20)

and TAIL will never exceed HEAD + M. Similarly, deletion of x becomes

    P ← MEM[IHEAD + x], TAIL ← TAIL − 1;
    if P ≠ TAIL, set y ← MEM[TAIL], MEM[P] ← y, MEM[IHEAD + y] ← P.        (21)

For example, after deleting an item from (19) we would obtain this:

    [Diagram: the updated list and inverse list.]        (22)
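
A minimal sketch of (20) and (21) follows, again with HEAD = 0 and with the
inverse list held in a separate array rather than at an offset IHEAD inside one
MEM array. The class name InvertibleSet is an assumption. Inverse entries of
absent items are simply left stale, which matches the remark that some cells
have undefined contents.

```python
class InvertibleSet:
    """Set of integers in range(M) with an inverse list, as in (20) and (21)."""

    def __init__(self, M):
        self.mem = [None] * M   # mem[k] plays the role of MEM[HEAD + k]
        self.inv = [None] * M   # inv[x] plays the role of MEM[IHEAD + x]
        self.tail = 0

    def insert(self, x):
        """Operation (20); x must not already be present."""
        self.mem[self.tail] = x
        self.inv[x] = self.tail
        self.tail += 1

    def delete(self, x):
        """Operation (21); x must be present."""
        p = self.inv[x]
        self.tail -= 1
        if p != self.tail:
            y = self.mem[self.tail]
            self.mem[p] = y
            self.inv[y] = p

    def items(self):
        return self.mem[:self.tail]

if __name__ == "__main__":
    s = InvertibleSet(10)
    for x in (3, 9, 1, 4):
        s.insert(x)
    s.delete(9)
    print(s.items(), s.inv[4])
```

Note that this version cannot answer "is x present?" reliably, because inv[x]
may hold a leftover position; the permutation-based variant discussed next
removes that limitation.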


In more elaborate situations we also want to test whether or not a given
item x is present. If so, we can keep more information in the inverse list.
A particularly useful variation arises when the list that begins at IHEAD contains
a complete permutation of the values {HEAD, HEAD + 1, ..., HEAD + M − 1}, and
the memory cells beginning at HEAD contain the inverse permutation, although
only the first TAIL − HEAD elements of that list are considered to be “active.”

For example, in our commafree code problem with m = 3, we can begin by
putting items representing the M = 18 cycle classes [0001], [0002], ..., [1222]
into memory cells HEAD through HEAD + 17. Initially they’re all active, with
TAIL = HEAD + 18 and MEM[IHEAD + c] = HEAD + c for 0 ≤ c < 18. Then
whenever we decide to choose a codeword for class c, we delete c from the active
list by using a souped-up version of (21) that maintains full permutations:

    P ← MEM[IHEAD + c], TAIL ← TAIL − 1;
    if P ≠ TAIL, set y ← MEM[TAIL], MEM[TAIL] ← c, MEM[P] ← y,
        MEM[IHEAD + c] ← TAIL, MEM[IHEAD + y] ← P.        (23)

Later on, after backtracking to a state where we once again want c to be
considered active, we simply set TAIL ← TAIL + 1, because c will already be in place!
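
Here is a small sketch of the permutation-maintaining deletion (23), with
HEAD = 0 and the two permutations held in separate arrays. The class name
ActiveList and the method names are hypothetical. Reactivation works only in
last-deleted, first-restored order, which is exactly the order that
backtracking needs.

```python
class ActiveList:
    """Active prefix of a permutation of range(M), maintained as in (23)."""

    def __init__(self, M):
        self.item = list(range(M))   # item[k] plays the role of MEM[HEAD + k]
        self.pos = list(range(M))    # pos[c] plays the role of MEM[IHEAD + c]
        self.tail = M                # items at positions < tail are active

    def is_active(self, c):
        return self.pos[c] < self.tail

    def deactivate(self, c):
        """Operation (23): swap c with the last active item, then shrink."""
        p = self.pos[c]
        self.tail -= 1
        if p != self.tail:
            y = self.item[self.tail]
            self.item[self.tail] = c
            self.item[p] = y
            self.pos[c] = self.tail
            self.pos[y] = p

    def reactivate_last(self):
        """Undo the most recent deactivation; the item is already in place."""
        self.tail += 1

if __name__ == "__main__":
    a = ActiveList(18)
    a.deactivate(5)
    a.deactivate(0)
    a.reactivate_last()
    print(a.is_active(0), a.is_active(5))
```

Because both arrays always hold full permutations, membership can be tested by
comparing pos[c] with tail, and undoing a deletion costs a single increment.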

Lists for the commafree problem. The task of finding all four-letter commafree
codes is not difficult when m = 3 and only 18 cycle classes are involved. But
it already becomes challenging when m = 4, because we must then deal with
(4^4 − 4^2)/4 = 60 classes. Therefore we’ll want to give it some careful thought as
we try to set it up for backtracking.

The example scenarios for m = 3 considered above suggest that we’ll repeatedly
want to know the answers to questions such as, “How many words of the
form 02** are still available for selection as codewords?” Redundant data
structures, oriented to queries of that kind, appear to be needed. Fortunately, we shall
see that there’s a nice way to provide them, using sequential lists as in (19)–(23).

In Algorithm C below, each of the m^4 four-letter words is given one of three
possible states during the search for commafree codes. A word is green if it’s part
of the current set of tentative codewords. It is red if it’s not currently a candidate
for such status, either because it is incompatible with the existing green words
or because the algorithm has already examined all scenarios in which it is green
in their presence. Every other word is blue, and sort of in limbo; the algorithm
might or might not decide to make it red or green. All words are initially blue,
except for the m^2 periodic words, which are permanently red.

We’ll use the Greek letter α to stand for the integer value of a four-letter
word x in radix m. For example, if m = 3 and if x is the word 0102, then
α = (0102)_3 = 11. The current state of word x is kept in MEM[α], using one of
the arbitrary internal codes 2 (GREEN), 0 (RED), or 1 (BLUE).

The most important feature of the algorithm is that every blue word x =
x_1x_2x_3x_4 is potentially present in seven different lists, called P1(x), P2(x),
P3(x), S1(x), S2(x), S3(x), and CL(x), where

• P1(x), P2(x), P3(x) are the blue words matching x_1***, x_1x_2**, x_1x_2x_3*;

• S1(x), S2(x), S3(x) are the blue words matching ***x_4, **x_3x_4, *x_2x_3x_4;

• CL(x) hosts the blue words in {x_1x_2x_3x_4, x_2x_3x_4x_1, x_3x_4x_1x_2, x_4x_1x_2x_3}.

These seven lists begin respectively in MEM locations P1OFF + p_1(α), P2OFF + p_2(α),
P3OFF + p_3(α), S1OFF + s_1(α), S2OFF + s_2(α), S3OFF + s_3(α), and CLOFF + 4cl(α);
here (P1OFF, P2OFF, P3OFF, S1OFF, S2OFF, S3OFF, CLOFF) are respectively (2m^4,
5m^4, 8m^4, 11m^4, 14m^4, 17m^4, 20m^4). We define p_1((x_1x_2x_3x_4)_m) = (x_1 000)_m,
p_2((x_1x_2x_3x_4)_m) = (x_1x_2 00)_m, p_3((x_1x_2x_3x_4)_m) = (x_1x_2x_3 0)_m,
s_1((x_1x_2x_3x_4)_m) = (x_4 000)_m, s_2((x_1x_2x_3x_4)_m) = (x_3x_4 00)_m,
s_3((x_1x_2x_3x_4)_m) = (x_2x_3x_4 0)_m; and finally cl((x_1x_2x_3x_4)_m) is an internal
number between 0 and (m^4 − m^2)/4 − 1 assigned to each class. (The trailing zeros
keep lists of the same kind from overlapping, in agreement with the starting
locations that appear in Table 1.) The seven MEM locations where x appears in
these seven lists are respectively kept in inverse lists that begin in MEM locations
P1OFF − m^4 + α, P2OFF − m^4 + α, ..., CLOFF − m^4 + α. And the TAIL pointers,
which indicate the current list sizes as in (19)–(23), are respectively kept in MEM
locations P1OFF + m^4 + p_1(α), P2OFF + m^4 + p_2(α), ..., CLOFF + m^4 + 4cl(α).
(Whew; got that?)

Table 1
LISTS USED BY ALGORITHM C (m = 2), ENTERING LEVEL 1

        0     1     2     3     4     5     6     7     8     9     a     b     c     d     e     f
  0   RED  BLUE  BLUE  BLUE   RED   RED  BLUE  BLUE   RED  BLUE   RED  BLUE  BLUE  BLUE  BLUE   RED
 10          20    21    22                23    24          29          2c    28    2b    2a
 20  0001  0010  0011  0110  0111                    1100  1001  1110  1101  1011                    P1
 30    25                                              2d
 40          50    51    52                54    55          58          59    5c    5e    5d
 50  0001  0010  0011        0110  0111              1001  1011              1100  1110  1101        P2
 60    53                      56                      5a                      5f
 70          80    82    83                86    87          88          8a    8c    8d    8e
 80  0001        0010  0011              0110  0111  1001        1011        1100  1101  1110        P3
 90    81          84          84          88          89          8b          8e          8f
 a0          b8    b0    b9                b1    bb          ba          bd    b2    bc    b3
 b0  0010  0110  1100  1110                          0001  0011  1001  0111  1101  1011              S1
 c0    b4                                              be
 d0          e4    e8    ec                e9    ed          e5          ee    e0    e6    ea
 e0  1100                    0001  1001  1101        0010  0110  1110        0011  0111  1011        S2
 f0    e1                      e7                      eb                      ef
100         112   114   116               11c   11e         113         117   118   11a   11d
110              0001  1001  0010        0011  1011  1100        1101        0110  1110  0111        S3
120   110         114         115         118         119         11b         11e         11f
130         140   141   144               145   148         147         14b   146   14a   149
140  0001  0010              0011  0110  1100  1001  0111  1110  1101  1011                          CL
150   142                     148                     14c

This table shows MEM locations 0000 through 150f, using hexadecimal notation. (For
example, MEM[40d] = 5e; see exercise 36.) Blank entries are unused by the algorithm.

This vast apparatus, which occupies 22m^4 cells of MEM, is illustrated in
Table 1, at the beginning of the computation for the case m = 2. Fortunately
it’s not really as complicated as it may seem at first. Nor is it especially vast:
After all, 22m^4 is only 13,750 when m = 5.
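
To make the layout concrete, the following sketch fills a MEM array for a given
m and reproduces the m = 2 entries of Table 1. Several things here are
assumptions rather than facts from the text: the function name build_lists; the
choice to store each word as its integer value α (Table 1 prints the words
themselves); the insertion order, which is inferred from Table 1 (class by
class, taking cyclic shifts by repeated left rotation); and the list-start
offsets, which are taken to be padded with zeros so that lists of one kind do
not overlap.

```python
def build_lists(m):
    """Lay out MEM for the start of the search; unused cells are None."""
    m4 = m ** 4
    RED, BLUE = 0, 1
    mem = [None] * (22 * m4)
    offs = [k * m4 for k in (2, 5, 8, 11, 14, 17, 20)]   # P1OFF, ..., CLOFF

    def rot(a):                      # cyclic left shift of the radix-m word a
        return (a % m**3) * m + a // m**3

    def heads(a, cl):                # relative start of each of a's seven lists
        x1, x2, x3, x4 = a // m**3, a // m**2 % m, a // m % m, a % m
        return [x1 * m**3, (x1 * m + x2) * m**2, (a // m) * m,
                x4 * m**3, (x3 * m + x4) * m**2, (a % m**3) * m, 4 * cl]

    for a in range(m4):
        mem[a] = RED
    for k in range(6):               # every prefix and suffix list starts empty
        for h in range(0, m4, m ** (3 - k % 3)):
            mem[offs[k] + m4 + h] = offs[k] + h
    cl = 0
    for a in range(m4):
        cyc = [a, rot(a), rot(rot(a)), rot(rot(rot(a)))]
        if a != min(cyc) or len(set(cyc)) < 4:
            continue                 # a does not name an aperiodic class
        mem[offs[6] + m4 + 4 * cl] = offs[6] + 4 * cl
        # In class [0001] only 0001 and 0010 are blue (reflection symmetry).
        for b in (cyc[:2] if cl == 0 else cyc):
            mem[b] = BLUE
            for off, h in zip(offs, heads(b, cl)):
                tail = mem[off + m4 + h]
                mem[tail] = b                    # insertion as in (20)
                mem[off - m4 + b] = tail
                mem[off + m4 + h] = tail + 1
        cl += 1
    return mem

if __name__ == "__main__":
    mem = build_lists(2)
    print(hex(mem[0x4d]), [format(a, "04b") for a in mem[0x28:0x2d]])
```

For m = 2 the inverse-list cell for word 1101 in the P2 lists is location 4d,
and it should contain 5e, as the note under Table 1 says.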

(A close inspection of Table 1 reveals incidentally that the words 0100 and
1000 have been colored red, not blue. That’s because we can assume without
loss of generality that class [0001] is represented either by 0001 or by 0010. The
other two cases are covered by left-right reflection of all codewords.)

Algorithm C finds these lists invaluable when it is deciding where next to
branch. But it has no further use for a list in which one of the items has become
green. Therefore it declares such lists “closed”; and it saves most of the work
of list maintenance by updating only the lists that remain open. A closed list is
represented internally by setting its TAIL pointer to HEAD − 1.

For example, Table 2 shows how the lists in MEM will have changed just
after x = 0010 has been chosen to be a tentative codeword. The elements
{0001, 0010, 0011, 0110, 0111} of P1(x) are effectively hidden, because the tail
pointer MEM[30] = 1f = 20 − 1 marks that list as closed. (Those list elements
actually do still appear in MEM locations 200 through 204, just as they did in Table 1.
But there’s no need to look at that list while any word of the form 0*** is green.)

Table 2
LISTS USED BY ALGORITHM C (m = 2), ENTERING LEVEL 2

        0     1     2     3     4     5     6     7     8     9     a     b     c     d     e     f
  0   RED   RED GREEN  BLUE   RED   RED  BLUE  BLUE   RED   RED   RED  BLUE  BLUE  BLUE  BLUE   RED
 10                                                                      29    28    2b    2a
 20                                                  1100  1011  1110  1101                          P1
 30    1f                                              2c
 40                                        54    55                      58    5c    5e    5d
 50                          0110  0111              1011                    1100  1110  1101        P2
 60    4f                      56                      59                      5f
 70                                        86    87                      8a    8c    8d    8e
 80                                      0110  0111              1011        1100  1101  1110        P3
 90    80          81          84          88          88          8b          8e          8f
 a0                      b9                      bb                      b8          ba
 b0                                                  1011  0011  1101  0111                          S1
 c0    af                                              bc
 d0                      ec                      ed                      ee    e0    e4
 e0  1100                    1101                                            0011  0111  1011        S2
 f0    e1                      e5                      e7                      ef
100                     116               11c   11e                     117   118   11a   11d
110                                      0011  1011  1100        1101        0110  1110  0111        S3
120   110         112         113         118         119         11b         11e         11f
130                     144               145   148                     14b   146   14a   149
140                          0011  0110  1100        0111  1110  1101  1011                          CL
150   13f                     147                     14c

The word 0010 has become green, thus closing its seven lists and making 0001 red. The
logic of Algorithm C has also made 1001 red. Hence 0001 and 1001 have been deleted
from the open lists in which they formerly appeared (see exercise 37). Only the active
entries of open lists are shown.

A general mechanism for doing and undoing. We’re almost ready to
finalize the details of Algorithm C and to get on with the search for commafree
codes, but a big problem still remains: The state of computation at every level
of the search involves all of the marvelous lists that we’ve just specified, and
those lists aren’t tiny. They occupy more than 5000 cells of MEM when m = 4,
and they can change substantially from level to level.

We could make a new copy of the entire state, whenever we advance to a
new node of the search tree. But that’s a bad idea, because we don’t want to
perform thousands of memory accesses per node. A much better strategy would
be to stick with a single instance of MEM, and to update and downdate the lists
as the search progresses, if we could only think of a simple way to do that.

And we’re in luck: There is such a way, first formulated by R. W. Floyd
in his classic paper “Nondeterministic algorithms” [JACM 14 (1967), 636–644].
Floyd’s original idea, which required a special compiler to generate forward and
backward versions of every program step, can in fact be greatly simplified when
all of the changes in state are confined to a single MEM array. All we need to
do is to replace every assignment operation of the form ‘MEM[a] ← v’ by the
slightly more cumbersome operation

    store(a, v): Set UNDO[u] ← (a, MEM[a]), MEM[a] ← v, and u ← u + 1.        (24)

Here UNDO is a sequential stack that holds (address, value) pairs; in our
application we could say ‘UNDO[u] ← (a ≪ 16) + MEM[a]’, because the cell addresses
and values never exceed 16 bits. Of course we’ll also need to check that the stack
pointer u doesn’t get too large, if the number of assignments has no a priori limit.

Later on, when we want to undo all changes to MEM since the time when u
had reached a particular value u_0, we simply do this:

    unstore(u_0): While u > u_0, set u ← u − 1,
        (a, v) ← UNDO[u], and MEM[a] ← v.        (25)

In our application the unstacking operation ‘(a, v) ← UNDO[u]’ here could be
implemented by saying ‘a ← UNDO[u] ≫ 16, v ← UNDO[u] & #ffff’.
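
Operations (24) and (25) can be sketched as follows, using the packed 16-bit
encoding just described. The class name ReversibleMemory is an assumption, and
a Python list stands in for the UNDO array, so its length plays the role of
the stack pointer u.

```python
class ReversibleMemory:
    """MEM with an UNDO stack of packed (address, old value) pairs."""

    def __init__(self, size):
        assert size <= 1 << 16       # addresses and values must fit in 16 bits
        self.mem = [0] * size
        self.undo = []               # len(self.undo) plays the role of u

    def store(self, a, v):
        """Operation (24): remember the old contents, then assign."""
        self.undo.append((a << 16) + self.mem[a])
        self.mem[a] = v

    def unstore(self, u0):
        """Operation (25): roll back every store made since u was u0."""
        while len(self.undo) > u0:
            packed = self.undo.pop()
            self.mem[packed >> 16] = packed & 0xFFFF

if __name__ == "__main__":
    r = ReversibleMemory(8)
    r.store(3, 7)
    mark = len(r.undo)               # a fallback point
    r.store(3, 9)
    r.store(5, 1)
    r.unstore(mark)
    print(r.mem)
```

Entries are popped in reverse order, so an address that was stored several
times ends up with the value it had at the fallback point.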

A useful refinement of this reversible-memory technique is often advantageous,
based on the idea of “stamping” that is part of the folklore of programming.
It puts only one item on the UNDO stack when the same memory address
is updated more than once in the same round.

    store(a, v): If STAMP[a] ≠ σ, set STAMP[a] ← σ,
        UNDO[u] ← (a, MEM[a]), and u ← u + 1.
        Then set MEM[a] ← v.        (26)

Here STAMP is an array with one entry for each address in MEM. It’s initially
all zero, and σ is initially 1. Whenever we come to a fallback point, where
the current stack pointer will be remembered as the value u_0 for some future
undoing, we “bump” the current stamp by setting σ ← σ + 1. Then (26) will
continue to do the right thing. (In programs that run for a long time, we must
be careful when integer overflow causes σ to be bumped to zero; see exercise 38.)

Notice that the combination of (24) and (25) will perform five memory
accesses for each assignment and its undoing. The combination of (26) and (25)
will cost seven mems for the first assignment to MEM[a], but only two mems
for every subsequent assignment to the same address. So (26) wins, if multiple
assignments exceed one-time-only assignments.
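
The stamped variant (26) might look like this in code. The class and method
names are assumptions. One design choice is worth flagging: this sketch also
bumps the stamp after every unstore, as step C5 of Algorithm C does below,
because otherwise a stale stamp could suppress an UNDO entry that is still
needed. Python integers do not overflow, so the wraparound issue of exercise 38
does not arise here.

```python
class StampedMemory:
    """Reversible memory that records at most one UNDO entry per address
    per round, following (26)."""

    def __init__(self, size):
        self.mem = [0] * size
        self.stamp = [0] * size      # STAMP, initially all zero
        self.sigma = 1               # the current stamp
        self.undo = []               # (address, old value) pairs

    def fallback_point(self):
        """Start a new round and return the stack height to remember."""
        self.sigma += 1
        return len(self.undo)

    def store(self, a, v):
        if self.stamp[a] != self.sigma:          # first store this round
            self.stamp[a] = self.sigma
            self.undo.append((a, self.mem[a]))
        self.mem[a] = v

    def unstore(self, u0):
        while len(self.undo) > u0:
            a, v = self.undo.pop()
            self.mem[a] = v
        self.sigma += 1              # old stamps must not match new stores

if __name__ == "__main__":
    s = StampedMemory(4)
    u0 = s.fallback_point()
    for v in (5, 6, 7):
        s.store(1, v)
    print(len(s.undo))               # three stores, one UNDO entry
    s.unstore(u0)
    print(s.mem)
```

Three successive stores to the same cell cost only one stack entry, yet the
rollback still restores the value that the cell had at the fallback point.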

Backtracking through commafree codes. OK, we’re now equipped with
enough basic knowhow to write a pretty good backtrack program for the problem
of generating all commafree four-letter codes.

Algorithm C below incorporates one more key idea, which is a lookahead
mechanism that is specific to commafree backtracking; we’ll call it the “poison
list.” Every item on the poison list is a pair, consisting of a suffix and a prefix
that the commafree rule forbids from occurring together. Every green word
x_1x_2x_3x_4 (that is, every word that will be a final codeword in the current
branch of our backtrack search) contributes three items to the poison list,
namely

    (*x_1x_2x_3, x_4***), (**x_1x_2, x_3x_4**), and (***x_1, x_2x_3x_4*).        (27)

If there’s a green word on both sides of a poison list entry, we’re dead: The
commafree condition fails, and we must backtrack. If there’s a green word on
one side but not the other, we can kill off all blue words on the other side by
making them red. And if either side of a poison list entry corresponds to an
empty list, we can remove this entry from the poison list because it will never
affect the outcome. (Blue words become red or green, but red words stay red.)

For example, consider the transition from Table 1 to Table 2. When word
0010 becomes green, the poison list receives its first three items:

    (*001, 0***), (**00, 10**), (***0, 010*).

The first of these kills off the *001 list, because 0*** contains the green word 0010.
That makes 1001 red. The last of these, similarly, kills off the 010* list; but
that list is empty when m = 2. The poison list now reduces to a single
item, (**00, 10**), which remains poisonous because list **00 contains the blue
word 1100 and 10** contains the blue word 1011.
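
The three poison pairs of (27) are simple to generate from a green word. The
sketch below works on words and patterns as plain strings, with * as a
wildcard; the function names poison_entries and matches are assumptions, and
the pattern matching here only illustrates the idea, whereas Algorithm C
consults its prefix and suffix lists instead.

```python
def poison_entries(x):
    """The three (suffix pattern, prefix pattern) pairs of (27) for word x."""
    x1, x2, x3, x4 = x
    return [("*" + x1 + x2 + x3, x4 + "***"),
            ("**" + x1 + x2, x3 + x4 + "**"),
            ("***" + x1, x2 + x3 + x4 + "*")]

def matches(pattern, word):
    """True if word fits the pattern, where * matches any letter."""
    return all(p in ("*", w) for p, w in zip(pattern, word))

if __name__ == "__main__":
    # Blue words in Table 1 once 0010 is green and 0001 has become red.
    blue = ["0011", "0110", "0111", "1001", "1011", "1100", "1101", "1110"]
    for suffix, prefix in poison_entries("0010"):
        print(suffix, [w for w in blue if matches(suffix, w)],
              prefix, [w for w in blue if matches(prefix, w)])
```

For the green word 0010 the first pair is (*001, 0***); since 0010 itself
matches 0***, the only blue word matching *001, namely 1001, must turn red.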

We’ll maintain the poison list at the end of MEM, following the CL lists. It
obviously will contain at most 3(m^4 − m^2)/4 entries, and in fact it usually turns
out to be quite small. No inverse list is required; so we shall adopt the simple
method of (17) and (18), but with two cells per entry so that TAIL will change
by ±2 instead of by ±1. The value of TAIL will be stored in MEM at key times so
that temporary changes to it can be undone.

The case m = 4, in which each codeword consists of four quaternary digits
{0, 1, 2, 3}, is particularly interesting, because an early backtrack program by Lee
Laxdal found that no such commafree code can make use of all 60 of the cycle
classes [0001], [0002], ..., [2333]. [See B. H. Jiggs, Canadian Journal of Math. 15
(1963), 178–187.] Laxdal’s program also reportedly showed that at least three of
those classes must be omitted; and it found several valid 57-word sets. Further
details were never published, because the proof that 58 codewords are impossible
depended on what Jiggs called a “quite time-consuming” computation.

Because size 60 is impossible, our algorithm cannot simply assume that a
move such as 1001 is forced when the other words 0011, 0110, 1100 of its class
have been ruled out. We must also consider the possibility that class [0011] is
entirely absent from the code. Such considerations add an interesting further
twist to the problem, and Algorithm C describes one way to cope with it.

Algorithm C (Four-letter commafree codes). Given an alphabet size m ≤ 7
and a goal g in the range L − m(m − 1) ≤ g ≤ L, where L = (m^4 − m^2)/4, this
algorithm finds all sets of g four-letter words that are commafree and include
either 0001 or 0010. It uses an array MEM of M = ⌊23.5m^4⌋ 16-bit numbers, as
well as several more auxiliary arrays: ALF of size 16^3 m; STAMP of size M; X, C,
S, and U of size L + 1; FREE and IFREE of size L; and a sufficiently large array
called UNDO whose maximum size is difficult to guess.

C1. [Initialize.] Set ALF[(abcd)_16] ← (abcd)_m for 0 ≤ a, b, c, d < m. Set
    STAMP[k] ← 0 for 0 ≤ k < M and σ ← 0. Put the initial prefix, suffix,
    and class lists into MEM, as in Table 1. Also create an empty poison list by
    setting MEM[PP] ← POISON, where POISON = 22m^4 and PP = POISON − 1.
    Set FREE[k] ← IFREE[k] ← k for 0 ≤ k < L. Then set l ← 1, x ← #0001,
    c ← 0, s ← L − g, f ← L, u ← 0, and go to step C3. (Variable l is the
    level, x is a trial word, c is its class, s is the “slack,” f is the number of free
    classes, and u is the size of the UNDO stack.)

C2. [Enter level l.] If l > L, visit the solution x_1 ... x_L and go to C6. Otherwise
    choose a candidate word x and class c as described in exercise 39.

C3. [Try the candidate.] Set U[l] ← u and σ ← σ + 1. If x < 0, go to C6 if s = 0
    or l = 1, otherwise set s ← s − 1. If x ≥ 0, update the data structures to
    make x green, as described in exercise 40, escaping to C5 if trouble arises.

C4. [Make the move.] Set X[l] ← x, C[l] ← c, S[l] ← s, p ← IFREE[c], f ←
    f − 1. If p ≠ f, set y ← FREE[f], FREE[p] ← y, IFREE[y] ← p, FREE[f] ←
    c, IFREE[c] ← f. (This is (23).) Then set l ← l + 1 and go to C2.

C5. [Try again.] While u > U[l], set u ← u − 1 and MEM[UNDO[u] ≫ 16] ←
    UNDO[u] & #ffff. (Those operations restore the previous state, as in (25).)
    Then σ ← σ + 1 and redden x (see exercise 40). Go to C2.

C6. [Backtrack.] Set l ← l − 1, and terminate if l = 0. Otherwise set x ← X[l],
    c ← C[l], f ← f + 1. If x < 0, repeat this step (class c was omitted from
    the code). Otherwise set s ← S[l] and go back to C5. ∎

Exercises 39 and 40 provide the instructive details that flesh out this skeleton.

Algorithm C needs just 13, 177, and 2380 megamems to prove that no
solutions exist for m = 4 when g is 60, 59, and 58. It needs about 22800 megamems
to find the 1152 solutions for g = 57; see exercise 44. There are roughly (14,
240, 3700, 38000) thousand nodes in the respective search trees, with most of
the activity taking place on levels 30 ± 10. The height of the UNDO stack never
exceeds 2804, and the poison list never contains more than 12 entries at a time.

Running time estimates. Backtrack programs are full of surprises. Sometimes
they produce instant answers to a supposedly difficult problem. But sometimes
they spin their wheels endlessly, trying to traverse an astronomically large search
tree. And sometimes they deliver results just about as fast as we might expect.

Fortunately, we needn’t sit in the dark. There’s a simple Monte Carlo
algorithm by which we can often tell in advance whether or not a given backtrack
strategy will be feasible. This method, based on random sampling, can actually
be worked out by hand before writing a program, in order to help decide whether
to invest further time while following a particular approach. In fact, the very act
of carrying out this pleasant pencil-and-paper method often suggests useful cutoff
strategies and/or data structures that will be valuable later when a program is
being written. For example, the author developed Algorithm C above after first
doing some armchair experiments with random choices of potential commafree
codewords, and noticing that a family of lists such as those in Tables 1 and 2
would be quite helpful when making further choices.

To illustrate the method, let’s consider the n queens problem again, as
represented in Algorithm B* above. When n = 8, we can obtain a decent “ballpark
estimate” of the size of Fig. 68 by examining only a few random paths in that
search tree. We start by writing down the number D_1 ← 8, because there are
eight ways to place the queen in row 1. (In other words, the root node of the
search tree has degree 8.) Then we use a source of random numbers, say the
binary digits of π mod 1 = (.001001000011...)_2, to select one of those
placements. Eight choices are possible, so we look at three of those bits; we shall set
X_1 ← 2, because 001 is the second of the eight possibilities (000, 001, ..., 111).

Fig. 69. Four random attempts to solve the 8 queens problem. Such experiments help
to estimate the size of the backtrack tree in Fig. 68. The branching degrees are shown at
the right of each diagram, while the random bits used for sampling appear below. Cells
have been shaded in gray if they are attacked by one or more queens in earlier rows.

Given X_1 = 2, the queen in row 2 can’t go into columns 1, 2, or 3. Hence
five possibilities remain for X_2, and we write down D_2 ← 5. The next three bits
of π lead us to set X_2 ← 5, since 5 is the second of the available columns (4, 5, 6,
7, 8) and 001 is the second value of (000, 001, ..., 100). If π had continued with
101 or 110 or 111 instead of 001, we would incidentally have used the “rejection
method” of Section 3.4.1 and moved to the next three bits; see exercise 47.

Continuing in this way leads to D_3 ← 4, X_3 ← 1; then D_4 ← 3, X_4 ← 4.
(Here we used the two bits 00 to select X_3 and the next two bits 00 to select X_4.)
The remaining branches are forced: D_5 ← 1, X_5 ← 7; D_6 ← 1, X_6 ← 3; D_7 ← 1,
X_7 ← 6; and we’re stuck when we reach level 8 and find D_8 ← 0.

These sequential random choices are depicted in Fig. 69(a), where we’ve
used them to place each queen successively into an unshaded cell. Parts (b), (c),
and (d) of Fig. 69 correspond in the same way to choices based on the binary
digits of e mod 1, φ mod 1, and γ mod 1. Exactly 10 bits of π, 20 bits of e, 13 bits
of φ, and 13 bits of γ were used to generate these examples.

In this discussion the notation D_k stands for a branching degree, not for a
domain of values. We’ve used uppercase letters for the numbers D_1, X_1, D_2,
etc., because those quantities are random variables. Once we’ve reached D_l = 0
at some level, we’re ready to estimate the overall cost, by implicitly assuming
that the path we’ve taken is representative of all root-to-leaf paths in the tree.

The cost of a backtrack program can be assessed by summing the individual
amounts of time spent at each node of the search tree. Notice that every node on
level l of that tree can be labeled uniquely by a sequence x_1 ... x_{l−1}, which defines
the path from the root to that node. Thus our goal is to estimate the sum of all
c(x_1 ... x_{l−1}), where c(x_1 ... x_{l−1}) is the cost associated with node x_1 ... x_{l−1}.

For example, the four queens problem is represented by the search tree (4),
and its cost is the sum of 17 individual costs

    c() + c(1) + c(13) + c(14) + c(142) + c(2) + c(24) + ... + c(413) + c(42).        (28)

If C(x_1 ... x_l) denotes the total cost of the subtree rooted at x_1 ... x_l, then

    C(x_1 ... x_l) = c(x_1 ... x_l) + C(x_1 ... x_l x_{l+1}^{(1)}) + ... + C(x_1 ... x_l x_{l+1}^{(d)})        (29)

when the choices for x_{l+1} at node x_1 ... x_l are {x_{l+1}^{(1)}, ..., x_{l+1}^{(d)}}. For instance
in (4) we have C(1) = c(1) + C(13) + C(14); C(13) = c(13); and C() = c() +
C(1) + C(2) + C(3) + C(4) is the overall cost (28).

In these terms a Monte Carlo estimate for C() is extremely easy to compute:

Theorem E. Given D_1, X_1, D_2, X_2, ... as above, the cost of backtracking is

    C() = E(c() + D_1(c(X_1) + D_2(c(X_1X_2) + D_3(c(X_1X_2X_3) + ...)))).        (30)

Proof. Node x_1 ... x_l, with branch degrees d_1, ..., d_l above it, is reached with
probability 1/(d_1 ... d_l); so it contributes d_1 ... d_l c(x_1 ... x_l)/(d_1 ... d_l) = c(x_1 ... x_l)
to the expected value in this formula. ∎

For example, the tree (4) has six root-to-leaf paths, and they occur with
respective probabilities 1/8, 1/8, 1/4, 1/4, 1/8, 1/8. The first one contributes
1/8 times c() + 4(c(1) + 2(c(13))), namely c()/8 + c(1)/2 + c(13), to the expected
value. The second contributes c()/8 + c(1)/2 + c(14) + c(142); and so on.

A special case of Theorem E, with all c(x_1 ... x_l) = 1, tells us how to estimate
the total size of the tree, which is often a crucial quantity:

Corollary E. The number of nodes in the search tree, given D_1, D_2, ..., is

    E(1 + D_1 + D_1D_2 + ...) = E(1 + D_1(1 + D_2(1 + D_3(1 + ...)))).        (31)

For example, Fig. 69 gives us four estimates for the size of the tree in Fig. 68,
using the numbers D_j at the right of each 8 × 8 diagram. The estimate from
Fig. 69(a) is 1 + 8(1 + 5(1 + 4(1 + 3(1 + 1(1 + 1(1 + 1)))))) = 2129; and the other
three are respectively 2689, 1489, 2609. None of them is extremely far from the
true number, 2057, although we can’t expect to be so lucky all the time.
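
The arithmetic of (31) and the exact tree sizes quoted above can both be
checked with a few lines of code. In this sketch the function names
estimate_from_degrees and count_nodes are illustrative assumptions; the second
function simply walks the whole row-by-row backtrack tree, so it is practical
only for small n.

```python
def estimate_from_degrees(degrees):
    """Evaluate 1 + D1(1 + D2(1 + ...)) for the degrees seen on one path."""
    total = 1
    for d in reversed(degrees):
        total = 1 + d * total
    return total

def count_nodes(n, queens=()):
    """Exact number of nodes, root included, in the backtrack tree that
    places one queen per row with no two queens attacking each other."""
    r = len(queens)
    if r == n:
        return 1
    total = 1
    for c in range(n):
        # Column c is feasible if no earlier queen shares a column or diagonal.
        if all(c != q and abs(c - q) != r - i for i, q in enumerate(queens)):
            total += count_nodes(n, queens + (c,))
    return total

if __name__ == "__main__":
    print(estimate_from_degrees([8, 5, 4, 3, 1, 1, 1]))   # Fig. 69(a)
    print(count_nodes(4), count_nodes(8))
```

The degree sequence 8, 5, 4, 3, 1, 1, 1 from Fig. 69(a) yields the estimate
2129, while the exhaustive count confirms 17 nodes for four queens and 2057
for eight.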

The detailed study in exercise 51 shows that the estimate ( 31 ) in the case 
of 8 queens turns out to be quite well behaved: 

(min 489, ave 2057, max 7409, dev V1146640 « 1071). ( 32 ) 

The analogous problem for 16 queens has a much less homogeneous search tree: 

(min 2597105, ave 1141190303, max 131048318769, dev ^ 12340(X)000). ( 33 ) 

Still, this standard deviation is roughly the same as the mean, so we'll usually
guess the correct order of magnitude. (For example, ten independent experiments
predicted .632, .866, .237, 1.027, 4.006, .982, .143, .140, 3.402, and .510 billion
nodes, respectively. The mean of these is 1.195.) A thousand trials with n = 64
suggest that the problem of 64 queens will have about $3 \times 10^{65}$ nodes in its tree.
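
For small n the whole distribution of the estimate can be tabulated exactly, by walking every root-to-leaf path and recording its probability together with its value of S. The sketch below does this for the n queens tree with unit costs; the names `estimate_distribution` and `summarize` are hypothetical, and the brute-force enumeration is only an illustration, not the method of exercise 51.

```python
from fractions import Fraction

def estimate_distribution(n):
    """List (probability, S) for every root-to-leaf path of the n queens tree."""
    results = []

    def safe(xs, x):
        l = len(xs)  # row index of the queen being added
        return all(x != xj and abs(x - xj) != l - j for j, xj in enumerate(xs))

    def walk(xs, prob, D, S):
        S += D  # this node has cost 1, weighted by the product of degrees above it
        kids = [x for x in range(n) if safe(xs, x)] if len(xs) < n else []
        if not kids:
            results.append((prob, S))
            return
        d = len(kids)
        for x in kids:
            walk(xs + [x], prob / d, D * d, S)

    walk([], Fraction(1), 1, 0)
    return results

def summarize(n):
    """Return (min, mean, max, variance) of the estimate S."""
    dist = estimate_distribution(n)
    mean = sum(p * s for p, s in dist)
    var = sum(p * s * s for p, s in dist) - mean * mean
    values = [s for _, s in dist]
    return min(values), mean, max(values), var
```

Calling `summarize(4)` gives the three possible estimates 13, 17, 21 of the tree (4) with mean 17, its true size; `summarize(8)` has mean 2057, and its other entries can be compared with (32).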
Let's formulate this estimation procedure precisely, so that it can be
performed conveniently by machine as well as by hand:

Algorithm E (Estimated cost of backtrack). Given domains $D_k$ and properties
$P_l$ as in Algorithm B, together with node costs $c(x_1 \ldots x_l)$ as above, this
algorithm computes the quantity S whose expected value is the total cost $C()$ in (30).
It uses an auxiliary array $y_0 y_1 \ldots$ whose size should be $\ge \max(|D_1|, \ldots, |D_n|)$.

E1. [Initialize.] Set $l \leftarrow 1$, $D \leftarrow 1$, $S \leftarrow 0$, and initialize any data structures needed.

E2. [Enter level l.] (At this point $P_{l-1}(X_1, \ldots, X_{l-1})$ holds.) Set $S \leftarrow S + D \cdot c(X_1 \ldots X_{l-1})$.
If l > n, terminate the algorithm. Otherwise set $d \leftarrow 0$
and set $x \leftarrow \min D_l$, the smallest element of $D_l$.

E3. [Test x.] If $P_l(X_1, \ldots, X_{l-1}, x)$ holds, set $y_d \leftarrow x$ and $d \leftarrow d + 1$.

E4. [Try again.] If $x \ne \max D_l$, set x to the next larger element of $D_l$ and return
to step E3.

E5. [Choose and try.] If d = 0, terminate. Otherwise set $D \leftarrow D \cdot d$ and $X_l \leftarrow y_I$,
where I is a uniformly random integer in {0, ..., d − 1}. Update the data
structures to facilitate testing $P_{l+1}$, set $l \leftarrow l + 1$, and go back to E2. ∎

Although Algorithm E looks rather like Algorithm B，it never backtracks. 
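
Steps E1 through E5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the callback interface (`domain`, `holds`, `cost`) and the names `algorithm_e`, `queens_ok`, and `estimate_queens_tree` are hypothetical, and steps E3 and E4 are collapsed into one list comprehension instead of an explicit loop over x.

```python
import random

def algorithm_e(n, domain, holds, cost, rng=random):
    """One run of Algorithm E; returns (S, D, path).

    domain(l)      -> candidate values for x_l, in increasing order
    holds(path, x) -> True if P_l(x_1, ..., x_{l-1}, x) holds
    cost(path)     -> c(x_1 ... x_{l-1})
    """
    l, D, S = 1, 1, 0                                    # E1
    path = []
    while True:
        S += D * cost(path)                              # E2
        if l > n:
            break
        y = [x for x in domain(l) if holds(path, x)]     # E3 and E4
        d = len(y)
        if d == 0:                                       # E5
            break
        D *= d
        path.append(y[rng.randrange(d)])
        l += 1
    return S, D, path

def queens_ok(path, x):
    """Cutoff property for n queens: x attacks none of the earlier queens."""
    l = len(path)
    return all(x != xj and abs(x - xj) != l - j for j, xj in enumerate(path))

def estimate_queens_tree(n, trials, seed=0):
    """Average S over independent runs, with unit node costs."""
    rng = random.Random(seed)
    runs = (algorithm_e(n, lambda l: range(1, n + 1), queens_ok, lambda p: 1, rng)
            for _ in range(trials))
    return sum(S for S, _, _ in runs) / trials
```

With n = 4 every run returns S = 13, 17, or 21, the three values whose weighted average is the true tree size 17.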

Of course we can't expect this algorithm to give decent estimates in cases
where the backtrack tree is wildly erratic. The expected value of S, namely E S,
is indeed the true cost; but the probable values of S might be quite different.

An extreme example of bad behavior occurs if property $P_l$ is the simple
condition '$x_1 > \cdots > x_l$', and all domains are {1, ..., n}. Then there's only one
solution, $x_1 \ldots x_n = n \ldots 1$; and backtracking is a particularly stupid way to find it!

The search tree for this somewhat ridiculous problem is, nevertheless, quite
interesting. It is none other than the binomial tree $T_n$ of Eq. 7.2.1.3-(21), which
has $\binom{n}{l}$ nodes on level l + 1 and $2^n$ nodes in total. If we set all costs to 1,
the expected value of S is therefore $2^n = e^{n \ln 2}$. But exercise 50 proves that
S will almost always be much smaller, less than $e^{(\ln n)^2 \ln\ln n}$. Furthermore the
average value of l when Algorithm E terminates with respect to $T_n$ is only $H_n + 1$.
When n = 100, for example, the probability that l > 20 on termination is only
0.0000000027, while the vast majority of the nodes are near level 51.
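
Both expectations quoted for the binomial tree can be checked exactly with a short recurrence. In the sketch below (the function name and the state variable m are assumptions introduced here), m is the number of values still smaller than the last one chosen, so a uniformly random choice leaves j remaining values for some j in {0, ..., m − 1}.

```python
from fractions import Fraction

def binomial_tree_expectations(n):
    """Exact E(S) and E(final l) for Algorithm E on x_1 > ... > x_l, unit costs."""
    ES = [Fraction(1)]   # m = 0: only the current node is counted
    EL = [Fraction(0)]   # m = 0: no further levels are entered
    for m in range(1, n + 1):
        ES.append(1 + sum(ES))        # 1 + m * (average of ES[0..m-1])
        EL.append(1 + sum(EL) / m)    # one more level, then a random smaller state
    return ES[n], 1 + EL[n]           # the root is entered with l = 1

es, el = binomial_tree_expectations(10)
print(es)   # 1024, which is 2^10
print(el)   # H_10 + 1 as an exact fraction
```

The first recurrence gives powers of 2 and the second gives harmonic numbers, in agreement with the values $2^n$ and $H_n + 1$ stated above.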

Many refinements of Algorithm E are possible. For example, exercise 52 
shows that the choices in step E5 need not be uniform. We shall discuss improved 
estimation techniques in Section 7.2.2.9, after having seen numerous examples 
of backtracking in practice. 

*Estimating the number of solutions. Sometimes we know that a problem
has more solutions than we could ever hope to generate, yet we still want to 
know roughly how many there are. Algorithm E will tell us the approximate 
number, in cases where the backtrack process never reaches a dead end — that 
is, if it never terminates with d = 0 in step E5. There may be another criterion 
for successful termination in step E2 even though l might still be ≤ n. The
expected final value of D is exactly the total number of solutions, because every
solution $X_1 \ldots X_l$ constructed by the algorithm is obtained with probability 1/D.
For example, suppose we want to know the number of different paths by
which a king can go from one corner of a chessboard to the opposite corner,
without revisiting any square. One such path, chosen at random using the bits
of π for guidance as we did in Fig. 69(a), is shown here. Starting in the upper left
corner, we have 3 choices for the first move. Then, after moving to the right, there are
4 choices for the second move. And so on. We never make a move that would
disconnect us from the goal; in particular, two of the moves are actually forced.
(Exercise 58 explains one way to avoid fatal mistakes.)

The probability of obtaining this particular path is exactly
$\frac13 \cdot \frac14 \cdot \frac16 \cdot \frac16 \cdot \frac12 \cdot \frac16 \cdot \frac17 \cdots \frac12 = 1/D$,
where $D = 3 \times 4 \times 6 \times 6 \times 2 \times 6 \times 7 \times \cdots \times 2 = 1^2 \cdot 2^4 \cdot 3^4 \cdot 4^{10} \cdot 5^9 \cdot 6^6 \cdot 7^1 \approx 8.7 \times 10^{20}$.
Thus we can reasonably guess, at least tentatively,
that there are $10^{21}$ such paths, more or less.

Of course that guess, based on a single random sample, rests on very shaky
grounds. But we know that the average value $M_N = (D^{(1)} + \cdots + D^{(N)})/N$ of N guesses,
in N independent experiments, will almost surely approach the correct number.
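
One random king path and its value of D can be generated as follows. This is a sketch under stated assumptions: the helper names are hypothetical, squares are (row, column) pairs on an n × n board, and the "never disconnect from the goal" rule is enforced by a fresh breadth-first search for each candidate move, which is simpler but slower than the dynamic DIST table of exercise 58.

```python
import random
from collections import deque

def king_neighbors(v, n):
    r, c = v
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < n and 0 <= c + dc < n:
                yield (r + dr, c + dc)

def can_reach(v, goal, visited, n):
    """Breadth-first search from v to goal through squares not in visited."""
    if v == goal:
        return True
    seen, queue = {v}, deque([v])
    while queue:
        u = queue.popleft()
        for w in king_neighbors(u, n):
            if w == goal:
                return True
            if w not in visited and w not in seen:
                seen.add(w)
                queue.append(w)
    return False

def random_king_path(n, rng):
    """Return (D, path) for one random simple path from (0, 0) to (n-1, n-1)."""
    start, goal = (0, 0), (n - 1, n - 1)
    path, visited, D = [start], {start}, 1
    while path[-1] != goal:
        moves = [w for w in king_neighbors(path[-1], n)
                 if w not in visited and can_reach(w, goal, visited | {w}, n)]
        D *= len(moves)                 # the goal stays reachable, so moves is nonempty
        nxt = moves[rng.randrange(len(moves))]
        path.append(nxt)
        visited.add(nxt)
    return D, path
```

Averaging D over many calls such as `random_king_path(8, random.Random(seed))` gives the estimate $M_N$. On a 2 × 2 board there are exactly five simple paths, and every run returns D = 3 or D = 6.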

How large should N be, before we can have any confidence in the results? 
The actual values of D obtained from random king paths tend to vary all over 
the map. Figure 70 plots typical results, as N varies from 1 to 10000. For each 
value of N we can follow the advice of statistics textbooks and calculate the 
sample variance $V_N = S_N/(N - 1)$ as in Eq. 4.2.2-(16); then $M_N \pm \sqrt{V_N/N}$ is
the textbook estimate. The top diagram in Fig. 70 shows these "error bars" in
gray, surrounding black dots for $M_N$. This sequence $M_N$ does appear to settle
down after N reaches 3000 or so, and to approach a value near $5 \times 10^{25}$. That's
much higher than our first guess, but it has lots of evidence to back it up.

On the other hand，the bottom chart in Fig. 70 shows the distribution of 
the logarithms of the 10000 values of D that were used to make the top chart. 
Almost half of those values were totally negligible, less than $10^{20}$. About 75%
of them were less than $10^{24}$. But some of them* exceeded $10^{28}$. Can we really
rely on a result that’s based on such chaotic behavior? Is it really right to throw 
away most of our data and to trust almost entirely on observations that were 
obtained from comparatively few rare events? 

Yes, we're okay! Some of the justification appears in exercise MPR-124,
which is based on theoretical work by P. Diaconis and S. Chatterjee. In the
paper cited with that exercise, they defend a simple measure of quality,

$Q_N = \max(D^{(1)}, \ldots, D^{(N)})/(N M_N) = \dfrac{\max(D^{(1)}, \ldots, D^{(N)})}{D^{(1)} + \cdots + D^{(N)}}$,   (34)
* Four of the actual values that led to Fig. 70 were larger than $10^{28}$; the largest, ≈ $2.1 \times 10^{28}$,
came from a path of length 57. The smallest estimate, 19361664, came from a path of length 10.

Fig. 70. Estimates of the number of king paths, based on up to 10000 random trials.
The middle graph shows the corresponding quality measures of Eq. (34). The lower
graph shows the logarithms of the individual estimates $D^{(k)}$, after they've been sorted.

arguing that a reasonable policy in most experiments such as these is to stop
sampling when $Q_N$ gets small. (Values of this statistic $Q_N$ have been plotted in
the middle of Fig. 70.)
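
The three statistics used in this discussion, $M_N$, the error bar $\sqrt{V_N/N}$, and $Q_N$, are easy to compute from a list of estimates. A minimal sketch follows; the function name `summarize_estimates` is hypothetical, and the variance is computed by the direct two-pass formula, not the running recurrence of Eq. 4.2.2-(16).

```python
import math

def summarize_estimates(ds):
    """Return (M_N, sqrt(V_N / N), Q_N) for the estimates D^(1), ..., D^(N)."""
    n = len(ds)
    mean = sum(ds) / n
    # V_N = S_N / (N - 1), where S_N is the sum of squared deviations from the mean
    var = sum((d - mean) ** 2 for d in ds) / (n - 1) if n > 1 else 0.0
    quality = max(ds) / sum(ds)          # Q_N of Eq. (34)
    return mean, math.sqrt(var / n), quality

print(summarize_estimates([2, 4]))       # mean 3.0, error bar 1.0, Q_N = 2/3
```

A value of $Q_N$ near 1 means that a single sample dominates the sum, so the average cannot yet be trusted; sampling would continue until $Q_N$ is small.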

Furthermore we can estimate other properties of the solutions to a backtrack 
problem, instead of merely counting those solutions. For example, the expected 
value of $lD$ on termination of the random king's path algorithm is the total
length of such paths. The data underlying Fig. 70 suggests that this total is
$(2.66 \pm .14) \times 10^{27}$; hence the average path length appears to be about 53. The
samples also indicate that about 34% of the paths pass through the center; about 
46% touch the upper right corner; about 22% touch both corners; and about 7% 
pass through the center and both corners. 

For this particular problem we don’t actually need to rely on estimates, 
because the ZDD technology of Section 7.1.4 allows us to compute the true 
values. (See exercise 59.) The total number of simple corner-to-corner king paths 
on a chessboard is exactly 50,819,542,770,311,581,606,906,543; this value lies 
almost within the error bars of Fig. 70 for all N > 250, except for a brief interval 
near N = 1400. And the total length of all these paths turns out to be exactly 
2,700,911,171,651,251,701,712,099,831, which is a little higher than our estimate. 

The true average length is therefore ≈ 53.15. The true probabilities of hitting the
center, a given corner, both corners, and all three of those spots are respectively
about 38.96%, 50.32%, 25.32%, and 9.86%.

The total number of corner-to-corner king paths of the maximum length, 63,
is 2,811,002,302,704,446,996,926. This is a number that cannot be estimated
well by a method such as Algorithm E without additional heuristics.

The analogous problem for corner-to-corner knight paths, of any length, lies 
a bit beyond ZDD technology because many more ZDD nodes are needed. Using 
Algorithm E we can estimate that there are about $(8.6 \pm 1.2) \times 10^{19}$ such paths.
Historical notes. The origins of backtrack programming are obscure. Equivalent
ideas must have occurred to many people, yet there was hardly any reason to
write them down until computers existed. We can be reasonably sure that James 
Bernoulli used such principles in the 17th century, when he successfully solved 
the “Tot tibi sunt dotes” problem that had eluded so many others (see Section 
7.2.1.7), because traces of the method exist in his exhaustive list of solutions. 

Backtrack programs typically traverse the tree of possibilities by using what
is now called depth-first search, a general graph exploration procedure that
Édouard Lucas credited to a student named Trémaux [Récréations Mathématiques 1
(Paris: Gauthier-Villars, 1882), 47-50].

The eight queens problem was first proposed by Max Bezzel [Schachzeitung
3 (1848), 363; 4 (1849), 40] and by Franz Nauck [Illustrirte Zeitung 14, 361
(1 June 1850), 352; 15, 377 (21 September 1850), 182], perhaps independently.
C. F. Gauss saw the latter publication, and wrote several letters about it to
his friend H. C. Schumacher. Gauss's letter of 27 September 1850 is especially
interesting, because it explained how to find all the solutions by backtracking,
which he called "Tatonniren," from a French term meaning "to feel one's way."
He also listed the lexicographically first solutions of each equivalence class under 
reflection and rotation: 15863724, 16837425, 24683175, 25713864, 25741863, 
26174835, 26831475, 27368514, 27581463, 35281746, 35841726, and 36258174. 

Computers arrived a hundred years later, and people began to use them 
for combinatorial problems. The time was therefore ripe for backtracking to 
be described as a general technique, and Robert J. Walker rose to the occasion 
[Proc. Symposia in Applied Math. 10 (1960), 91-94]. His brief note introduced 
Algorithm W in machine-oriented form, and mentioned that the procedure could 
readily be extended to find variable-length patterns $x_1 \ldots x_n$ where n is not fixed.

The next milestone was a paper by Solomon W. Golomb and Leonard D. 
Baumert [JACM 12 (1965), 516-524], who formulated the general problem carefully
and presented a variety of examples. In particular, they discussed the search
for maximum commafree codes, and noted that backtracking can be used to find 
successively better and better solutions to combinatorial optimization problems. 
They introduced certain kinds of lookahead, as well as the important idea of 
dynamic ordering by branching on variables with the fewest remaining choices. 

Other noteworthy early discussions of backtrack programming appear in 
Mark Wells's book Elements of Combinatorial Computing (1971), Chapter 4; in
a survey by J. R. Bitner and E. M. Reingold, CACM 18 (1975), 651-656; and
in the Ph.D. thesis of John Gaschnig [Report CMU-CS-79-124 (Carnegie Mellon
University, 1979), Chapter 4]. Gaschnig introduced techniques of "backmarking"
and “backjumping” that we shall discuss later. 

Monte Carlo estimates of the cost of backtracking were first described briefly
by M. Hall, Jr., and D. E. Knuth in Computers and Computing, AMM 72, 2,
part 2, Slaught Memorial Papers No. 10 (February 1965), 21-28. Knuth gave a
much more detailed exposition a decade later, in Math. Comp. 29 (1975), 121-136.
Such methods can be considered as special cases of so-called "importance
sampling"; see J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods
(London: Methuen, 1964), 57-59. Studies of random self-avoiding walks such
as the king paths discussed above were inaugurated by M. N. Rosenbluth and
A. W. Rosenbluth, J. Chemical Physics 23 (1955), 356-359.
EXERCISES

► 1. [22] Explain how the tasks of generating (i) n-tuples, (ii) permutations,
(iii) combinations, (iv) integer partitions, (v) set partitions, and (vi) nested
parentheses can all be regarded as special cases of backtrack programming, by
presenting suitable domains $D_k$ and cutoff properties $P_l(x_1, \ldots, x_l)$ that satisfy (1) and (2).

2. [10] True or false: We can choose $D_1$ so that $P_1(x_1)$ is always true.

3. [16] Using a chessboard and eight coins to represent queens, one can follow the 
steps of Algorithm B and essentially traverse the tree of Fig. 68 by hand in about three 
hours. Invent a trick to save half of the work. 


► 4. [20] Reformulate Algorithm B as a recursive procedure called try(l), having global
variables n and $x_1 \ldots x_n$, to be invoked by saying 'try(1)'. Can you imagine why the
author of this book decided not to present the algorithm in such a recursive form?

5. [20] Given r, with 1 ≤ r ≤ n, in how many ways can 7 nonattacking queens be
placed on an 8 × 8 chessboard, if no queen is placed in row r?

6. [20] (T. B. Sprague, 1890.) Are there any values n > 5 for which the n queens
problem has a "framed" solution with $x_1 = 2$, $x_2 = n$, $x_{n-1} = 1$, and $x_n = n - 1$?

7. [20] Are there two 8-queen placements with the same $x_1 x_2 x_3 x_4 x_5 x_6$?

8. [21] Can a 4m-queen placement have 3m queens on “white” squares? 



9. [22] Adapt Algorithm W to the n queens problem, using bitwise operations on 
n-bit numbers as suggested in the text. 

10. [M25] (W. Ahrens, 1910.) Both solutions of the n queens problem
when n = 4 have chiral symmetry: Rotation by 90° leaves them
unchanged, but reflection doesn't.

a) Can the n queens problem have a solution with reflection symmetry?

b) Show that chiral symmetry is impossible when n mod 4 ∈ {2, 3}.

c) Sometimes the solution to an n queens problem contains four queens
that form the corners of a tilted square, as shown here. Prove that we
can always get another solution by tilting the square the other way (but
leaving the other n − 4 queens in place).

d) Let $C_n$ be the number of chirally symmetric solutions, and suppose
$c_n$ of them have $x_k > k$ for $1 \le k \le n/2$. Prove that $C_n = 2^{\lfloor n/4 \rfloor} c_n$.

11. [M28] (Wraparound queens.) Replace (3) by the stronger conditions '$x_j \ne x_k$,
$(x_k - x_j) \bmod n \ne k - j$, $(x_j - x_k) \bmod n \ne k - j$'. (The n × n grid becomes a torus.)
Prove that the resulting problem is solvable if and only if n is not divisible by 2 or 3.




12. [M30] For which n ≥ 0 does the n queens problem have at least one solution?

13. [M25] If exercise 11 has T(n) toroidal solutions, show that $Q(mn) \ge Q(m)^n\,T(n)$.

14. [HM47] Does $(\ln Q(n))/(n \ln n)$ approach a positive constant as $n \to \infty$?

15. [21] Let H(n) be the number of ways that n queen bees can occupy
an n × n honeycomb so that no two are in the same line. (For example,
one of the H(4) = 7 ways is shown here.) Compute H(n) for small n.

16. [15] J. H. Quick (a student) noticed that the loop in step L2 of Algorithm L can
be changed from 'while $x_l < 0$' to 'while $x_l \ne 0$', because $x_l$ cannot be positive at
that point of the algorithm. So he decided to eliminate the minus signs and just set
$x_{l+k+1} \leftarrow k$ in step L3. Was it a good idea?

17. [17] Suppose that n = 4 and Algorithm L has reached step L2 with l = 4 and
$x_1 x_2 x_3 = 241$. What are the current values of $x_4 x_5 x_6 x_7 x_8$, $p_0 p_1 p_2 p_3 p_4$, and $y_1 y_2 y_3$?

19. [M19] What are the domains $D_l$ in Langford's problem (7)?

► 20. [21] Extend Algorithm L so that it forces x_l ← k whenever k ∉ {x_1, ...,

► 21. [M25] If $x = x_1 x_2 \ldots x_{2n}$, let $x^D = (-x_{2n}) \ldots (-x_2)(-x_1) = -x^R$ be its dual.

a) Show that if n is odd and x solves Langford's problem (7), we have $x_k = n$ for
some $k \le \lfloor n/2 \rfloor$ if and only if $x^D_k = n$ for some $k > \lfloor n/2 \rfloor$.

b) Find a similar rule that distinguishes x from $x^D$ when n is even.

c) Consequently the algorithm of exercise 20 can be modified so that exactly one of
each dual pair of solutions $\{x, x^D\}$ is visited.

22. [M26] Explore "loose Langford pairs": Replace '$j + k + 1$' in (7) by '$j + \lfloor 3k/2 \rfloor$'.

23. [17] We can often obtain one word rectangle from another by changing only a 
letter or two. Can you think of any 5x6 rectangles that almost match ( 10 )? 

24. [20] Customize Algorithm B so that it will find all 5 x 6 word rectangles. 

► 25. [25] Explain how to use orthogonal lists, as in Fig. 13 of Section 2.2.6, so that it’s 
easy to visit all 5-letter words whose kth character is c, given 1 ≤ k ≤ 5 and a ≤ c ≤ z.
Use those sublists to speed up the algorithm of exercise 24. 

26. [21] Can you find nice word rectangles of sizes 5 x 7, 5 x 8 , 5 x 9, 5 x 10? 

27. [22] What profile and average node costs replace ( 13 ) and ( 14 ) when we ask the 
algorithm of exercise 25 for 6 x 5 word rectangles instead of 5 x 6 ? 

► 28. [23] The method of exercises 24 and 25 does n levels of backtracking to fill the 
cells of an m x n rectangle one column at a time，using a trie to detect illegal prefixes 
in the rows. Devise a method that does mn levels of backtracking and fills just one 
cell per level, using tries for both rows and columns. 

29. [15] What’s the largest commafree subset of the following words? 

aced babe bade bead beef cafe cede dada dead deaf face fade feed 

► 30. [22] Let w_1, w_2, ..., w_m be four-letter words on an m-letter alphabet. Design an
algorithm that accepts or rejects each w_j, according as w_j is commafree or not with
respect to the accepted words of {w_1, ...,

31. [M22] A two-letter block code on an m-letter alphabet can be represented as a
digraph D on m vertices, with a → b if and only if ab is a codeword.

a) Prove that the code is commafree if and only if D has no oriented paths of length 3.

b) How many arcs can be in a digraph with no oriented paths of length r?

► 32. [M30] (W. L. Eastman, 1965.) The following elegant construction yields a
commafree code of maximum size for any odd block length n, over any alphabet. Given a
sequence $x = x_0 x_1 \ldots x_{n-1}$ of nonnegative integers, where x differs from each of its
other cyclic shifts $x_k \ldots x_{n-1} x_0 \ldots x_{k-1}$ for 0 < k < n, the procedure outputs a cyclic
shift σx with the property that the set of all such σx is commafree.

We regard x as an infinite periodic sequence $\langle x_n \rangle$ with $x_k = x_{k-n}$ for all $k \ge n$.
Each cyclic shift then has the form $x_k x_{k+1} \ldots x_{k+n-1}$. The simplest nontrivial example
occurs when n = 3, where $x = x_0 x_1 x_2 x_0 x_1 x_2 x_0 \ldots$ and we don't have $x_0 = x_1 = x_2$.
In this case the algorithm outputs $x_k x_{k+1} x_{k+2}$ where $x_k \ge x_{k+1} < x_{k+2}$; and the set
of all such triples clearly satisfies the commafree condition.
One key idea is to think of x as partitioned into t substrings by boundary
markers $b_j$, where $0 \le b_0 < b_1 < \cdots < b_{t-1} < n$ and $b_j = b_{j-t} + n$ for $j \ge t$. Then substring
$y_j$ is $x_{b_j} x_{b_j+1} \ldots x_{b_{j+1}-1}$. The number t of substrings is always odd. Initially t = n
and $b_j = j$ for all j; ultimately t = 1, and $\sigma x = y_0$ is the desired output.

Eastman's algorithm is based on comparison of adjacent substrings $y_{j-1}$ and $y_j$.
If those substrings have the same length, we use lexicographic comparison; otherwise
we declare that the longer substring is bigger.

The second key idea is the notion of "dips," which are substrings of the form
$z = z_1 \ldots z_k$ where $k \ge 2$ and $z_1 \ge \cdots \ge z_{k-1} < z_k$. It's easy to see that any string
$y = y_0 y_1 \ldots$ in which we have $y_i < y_{i+1}$ for infinitely many i can be factored into a
sequence of dips, $y = z^{(0)} z^{(1)} \ldots$, and this factorization is unique. For example,

3141592653589793238462643383 ... = 314 15 926 535 89 79 323 846 26 4338 3 ....

Furthermore, if y is a periodic sequence, its factorization into dips is also ultimately
periodic, although some of the initial factors may not occur in the period. For example,

123443550123443550123443550 ... = 12 34 435 501 23 4435 501 23 4435 ....

Given a periodic, nonconstant sequence y described by boundary markers as above,
where the period length t is odd, its periodic factorization will contain an odd number
of odd-length dips. Each round of Eastman's algorithm simply retains the boundary
points at the left of those odd-length dips. Then t is reset to the number of retained
boundary points, and another round begins if t > 1.

a) Play through the algorithm by hand when n = 19 and x = 3141592653589793238.

b) Show that the number of rounds is at most $\lfloor \log_3 n \rfloor$.

c) Exhibit a binary x that achieves this worst-case bound when $n = 3^e$.

d) Implement the algorithm with full details. (It’s surprisingly short!) 

e) Explain why the algorithm yields a commafree code. 
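
The factorization into dips described in exercise 32 can be sketched as follows. This is only the dip-splitting step for a finite prefix, not Eastman's full algorithm; the function name `factor_into_dips` is hypothetical, and the sequence may be a string of digits or a list of integers.

```python
def factor_into_dips(y):
    """Split a finite sequence into dips z_1 >= ... >= z_{k-1} < z_k with k >= 2.

    Returns (dips, leftover), where leftover is the unfinished tail."""
    dips, start, i = [], 0, 0
    while i + 1 < len(y):
        if y[i] < y[i + 1]:              # the first ascent closes the current dip
            dips.append(y[start:i + 2])
            start = i + 2
            i = start
        else:
            i += 1
    return dips, y[start:]

print(factor_into_dips("3141592653589793238462643383"))
```

For the digits of π shown in the exercise this yields 314, 15, 926, 535, 89, 79, 323, 846, 26, 4338, with the final 3 left over as the start of the next dip.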

33. [HM28] What is the probability that Eastman's algorithm finishes in one round?
(Assume that x is a random m-ary string of odd length n > 1, unequal to any of its
other cyclic shifts. Use a generating function to express the answer.)

34. [18] Why can't a commafree code of length $(m^4 - m^2)/4$ contain 0001 and 2000?

► 35. [15] Why do you think sequential data structures such as (16)-(23) weren't
featured in Section 2.2.2 of this series of books (entitled "Sequential Allocation")?

36. [17] What's the significance of (a) MEM[40d] = 5e and (b) MEM[904] = 84 in Table 1?

37. [18] Why is (a) MEM[f8] = e7 and (b) MEM[a0d] = ba in Table 2?

38. [20] Suppose you're using the undoing scheme (26) and the operation $\sigma \leftarrow \sigma + 1$
has just bumped the current stamp σ to zero. What should you do?

► 39. [25] Spell out the low-level implementation details of the candidate selection
process in step C2 of Algorithm C. Use the routine store(a, v) of (26) whenever changing
the contents of MEM, and use the following selection strategy:

a) Find a class c with the least number r of blue words.

b) If r = 0, set x ← −1; otherwise set x to a word in class c.

c) If r > 1, use the poison list to find an x that maximizes the number of blue words
that could be killed on the other side of the prefix or suffix list that contains x.

► 40. [28] Continuing exercise 39, spell out the details of step C3 when x ≥ 0.

a) What updates should be done to MEM when a blue word x becomes red?

b) What updates should be done to MEM when a blue word x becomes green?

c) Step C3 begins its job by making x green as in part (b). Explain how it should
finish its job by updating the poison list.

42. [M30] Is there a binary (m = 2) commafree code with one codeword in each of
the $\bigl(\sum_{d \mid n} \mu(d)\,2^{n/d}\bigr)/n$ cycle classes, for every word length n?

44. [HM29] A commafree code on m letters is equivalent to 2m! such codes if we 
permute the letters and/or replace each codeword by its left-right reflection. 

Determine all of the nonisomorphic commafree codes of length 4 on m letters when 
m is (a) 2 (b) 3 (c) 4 and there are (a) 3 (b) 18 (c) 57 codewords. 

45. [M42] Find a maximum-size commafree code of length 4 on m = 5 letters. 

47. [20] Explain how the choices in Fig. 69 were determined from the “random” bits 
that are displayed. For instance, why was $X_2$ set to 1 in Fig. 69(b)?

48. [M15] Interpret the value $\mathrm{E}(D_1 \ldots D_l)$, in the text's Monte Carlo algorithm.

49. [M22] What’s a simple martingale that corresponds to Theorem E? 

► 50. [HM25] Elmo uses Algorithm E with $D_k = \{1, \ldots, n\}$, $P_l = [x_1 > \cdots > x_l]$, c = 1.

a) Alice flips n coins independently, where coin k yields "heads" with probability 1/k.
True or false: She obtains exactly l heads with probability ${n \brack l}/n!$.

b) Let $Y_1, Y_2, \ldots, Y_l$ be the numbers on the coins that come up heads. (Thus $Y_1 = 1$,
and $Y_2 = 2$ with probability 1/2.) Show that Pr(Alice obtains $Y_1, Y_2, \ldots, Y_l$) =
Pr(Elmo obtains $X_1 = Y_l$, $X_2 = Y_{l-1}$, ..., $X_l = Y_1$).

c) Prove that Alice q.s. obtains at most (ln n)(ln ln n) heads.

d) Consequently Elmo's S is q.s. less than $\exp((\ln n)^2 (\ln\ln n))$.

► 51. [M30] Extend Algorithm B so that it also computes the minimum, maximum, 
mean, and variance of the Monte Carlo estimates S produced by Algorithm E. 

52. [M21] Instead of choosing each $y_i$ in step E5 with probability 1/d, we could use
a biased distribution where $\Pr(I = i \mid X_1, \ldots, X_{l-1}) = p_{X_1 \ldots X_{l-1}}(y_i) > 0$. How should
the estimate S be modified so that its expected value in this general scheme is still $C()$?

53. [M20] If all costs $c(x_1, \ldots, x_l)$ are positive, show that the biased probabilities of
exercise 52 can be chosen in such a way that the estimate S is always exact.

► 55. [M25] The commafree code search procedure in Algorithm C doesn’t actually 
fit the mold of Algorithm E, because it incorporates lookahead, dynamic ordering,
reversible memory, and other enhancements to the basic backtrack paradigms. How
could its running time be reliably estimated with Monte Carlo methods?

57. [M20] Algorithm E can potentially follow M different paths $X_1 \ldots X_{l-1}$ before it
terminates, where M is the number of leaves of the backtrack tree. Suppose the final
values of D at those leaves are $D^{(1)}, \ldots, D^{(M)}$. Prove that $(D^{(1)} \ldots D^{(M)})^{1/M} \ge M$.

58. [27] The text’s king path problem is a special case of the general problem of 
counting simple paths from vertex s to vertex t in a given graph. 

We can generate such paths by random walks from s that don’t get stuck, if we 
maintain a table of values DIST (v) for all vertices v not yet in the path, representing 
the shortest distance from v to t through unused vertices. For with such a table we
can simply move at each step to a vertex for which DIST(v) < ∞.

Devise a way to update the DIST table dynamically without unnecessary work. 

59. [26] A ZDD with 3,174,197 nodes can be constructed for the family of all simple
corner-to-corner king paths on a chessboard, using the method of exercise 7.1.4-225.
Explain how to use this ZDD to compute (a) the total length of all paths; (b) the
number of paths that touch any given subset of the center and/or corner points.

► 60. [20] Experiment with biased random walks (see exercise 52), weighting each
non-dead-end king move to a new vertex v by $1 + \mathrm{DIST}(v)^2$ instead of choosing every such
move with the same probability. Does this strategy improve on Fig. 70?

Table 666 

TWENTY QUESTIONS (SEE EXERCISE 90) 

1. The first question whose answer is A is:
(A) 1 (B) 2 (C) 3 (D) 4 (E) 5

2. The next question with the same answer as this one is:
(A) 4 (B) 6 (C) 8 (D) 10 (E) 12

3. The only two consecutive questions with identical answers are questions:
(A) 15 and 16 (B) 16 and 17 (C) 17 and 18 (D) 18 and 19 (E) 19 and 20

4. The answer to this question is the same as the answers to questions:
(A) 10 and 13 (B) 14 and 16 (C) 7 and 20 (D) 1 and 15 (E) 8 and 12

5. The answer to question 14 is:
(A) B (B) E (C) C (D) A (E) D

6. The answer to this question is:
(A) A (B) B (C) C (D) D (E) none of those

7. An answer that appears most often is:
(A) A (B) B (C) C (D) D (E) E

8. Ignoring answers that appear equally often, the least common answer is:
(A) A (B) B (C) C (D) D (E) E

9. The sum of all question numbers whose answers are correct and the same as this one is:
(A) ∈ [59 .. 62] (B) ∈ [52 .. 55] (C) ∈ [44 .. 49] (D) ∈ [61 .. 67] (E) ∈ [44 .. 53]

10. The answer to question 17 is:
(A) D (B) B (C) A (D) E (E) wrong

11. The number of questions whose answer is D is:
(A) 2 (B) 3 (C) 4 (D) 5 (E) 6

12. The number of other questions with the same answer as this one is the same as the number
of questions with answer:
(A) B (B) C (C) D (D) E (E) none of those

13. The number of questions whose answer is E is:
(A) 5 (B) 4 (C) 3 (D) 2 (E) 1

14. No answer appears exactly this many times:
(A) 2 (B) 3 (C) 4 (D) 5 (E) none of those

15. The set of odd-numbered questions with answer A is:
(A) {7} (B) {9} (C) not {11} (D) {13} (E) {15}

16. The answer to question 8 is the same as the answer to question:
(A) 3 (B) 2 (C) 13 (D) 18 (E) 20

17. The answer to question 10 is:
(A) C (B) D (C) B (D) A (E) correct

18. The number of prime-numbered questions whose answers are vowels is:
(A) prime (B) square (C) odd (D) even (E) zero

19. The last question whose answer is B is:
(A) 14 (B) 15 (C) 16 (D) 17 (E) 18

20. The maximum score that can be achieved on this test is:
(A) 18 (B) 19 (C) 20 (D) indeterminate
(E) achievable only by getting this question wrong


► 90. [M29] (Donald R. Woods, 2000.) Find all ways to maximize the number of correct
answers to the questionnaire in Table 666. Each question must be answered with a
letter from A to E. Hint: Begin by clarifying the exact meaning of this exercise. What
answers are best for the following two-question, two-letter "warmup problem"?

1. (A) Answer 2 is B. (B) Answer 1 is A.
2. (A) Answer 1 is correct. (B) Either answer 2 is wrong or answer 1 is A, but not both.

91. [HM28] Show that exercise 90 has a surprising, somewhat paradoxical answer if
two changes are made to Table 666: 9(E) becomes '∈ [39 .. 43]'; 15(C) becomes '{11}'.

95. [HM26] Let $P_n$ be the number of integer sequences $x_1 \ldots x_n$ such that $x_1 = 1$ and
$1 \le x_{k+1} \le 2x_k$ for $1 \le k < n$. (The first few values are 1, 2, 6, 26, 166, 1626, ...;
this sequence was introduced by A. Cayley in Philosophical Magazine (4) 13 (1857),
245-248, who showed that $P_n$ enumerates the partitions of $2^n - 1$ into powers of 2.)

a) Show that $P_n$ is the number of different profiles that are possible for a binary tree
of height n.

b) Find an efficient way to compute $P_n$ for large n. Hint: Consider the more general
sequence $P_n^{(m)}$, defined similarly but with $x_1 = m$.

c) Use the estimation procedure of Theorem E to prove that $P_n \ge 2^{\binom{n}{2}}/(n - 1)!$.
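
The first few values quoted in exercise 95 can be checked by brute force. The sketch below is a direct memoized recursion on the generalized sequence with $x_1 = m$; it is an illustration only, not the efficient method that part (b) asks for, and the function name `P` with its default argument is an assumption introduced here.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def P(n, m=1):
    """Number of sequences x_1 ... x_n with x_1 = m and 1 <= x_{k+1} <= 2 x_k."""
    if n == 1:
        return 1
    return sum(P(n - 1, j) for j in range(1, 2 * m + 1))

print([P(n) for n in range(1, 7)])   # [1, 2, 6, 26, 166, 1626]
```

For these small n one can also confirm numerically that $P_n\,(n-1)!$ is at least $2^{\binom{n}{2}}$, using `comb` and `factorial`.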

999. [M00] this is a temporary exercise (for dummies)

SECTION 7.2.2

1. Although many formulations are possible, the following may be the nicest: (i) $D_k$
is arbitrary (but hopefully finite), and $P_l$ is always true. (ii) $D_k = \{1, 2, \ldots, n\}$ and
$P_l$ = '$x_j \ne x_k$ for $1 \le j < k \le l$'. (iii) For combinations of n things from N,
$D_k = \{1, \ldots, N + 1 - k\}$ and $P_l$ = '$x_1 > \cdots > x_l$'. (iv) $D_k = \{0, 1, \ldots, \lfloor n/k \rfloor\}$;
$P_l$ = '$x_1 \ge \cdots \ge x_l$ and $n - (n - l)x_l \le x_1 + \cdots + x_l \le n$'. (v) For restricted growth
strings, $D_k = \{0, \ldots, k - 1\}$ and $P_l$ = '$x_{j+1} \le 1 + \max(x_1, \ldots, x_j)$ for $1 \le j < l$'.
(vi) For the indices of left parentheses (see 7.2.1.6-(8)), $D_k = \{1, \ldots, 2k - 1\}$ and
$P_l$ = '$x_1 < \cdots < x_l$'.

2. True. (If not, set $D_1 \leftarrow D_1 \cap \{x \mid P_1(x)\}$.)

3. We can restrict $D_1$ to {1, 2, 3, 4}, because the reflection $(9 - x_1) \ldots (9 - x_8)$ of every
solution $x_1 \ldots x_8$ is also a solution. (H. C. Schumacher made this observation in a letter
to Gauss, 24 September 1850.) Notice that Fig. 68 is left-right symmetric.

4. try(l) = "If l > n, visit $x_1 \ldots x_n$. Otherwise, for $x_l \leftarrow \min D_l$, $\min D_l + 1$, ...,
$\max D_l$, if $P_l(x_1, \ldots, x_l)$ call try(l + 1)."

This formulation is elegant, and fine for simple problems. But it doesn’t give any 
clue about why the method is called "backtrack"! Nor does it yield efficient code for
important problems whose inner loop is performed billions of times. We will see that
the key to efficient backtracking is to provide good ways to update and downdate the 
data structures that speed up the testing of property Pi. The overhead of recursion can 
get in the way, and the actual iterative structure of Algorithm B isn’t difficult to grasp. 
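
The recursive formulation of answer 4 can be sketched in a few lines of Python. This is only an illustration: the names `try_level`, `queens_ok`, and `queens`, and the use of n queens as the test property, are assumptions made here for the example, not part of the text.

```python
def try_level(l, n, x, domain, prop, visit):
    """Recursive backtracking: x holds x_1..x_{l-1}; try every value for x_l."""
    if l > n:
        visit(tuple(x))
        return
    for value in domain(l):            # min D_l, min D_l + 1, ..., max D_l
        x.append(value)
        if prop(x):                    # the cutoff test P_l(x_1, ..., x_l)
            try_level(l + 1, n, x, domain, prop, visit)
        x.pop()                        # undo the tentative choice


def queens_ok(x):
    """P_l for n queens: the newest queen attacks none of the earlier ones."""
    l = len(x)
    return all(x[j] != x[-1] and abs(x[j] - x[-1]) != l - 1 - j
               for j in range(l - 1))


def queens(n):
    """Return all n-queens solutions in lexicographic order."""
    solutions = []
    try_level(1, n, [], lambda l: range(1, n + 1), queens_ok, solutions.append)
    return solutions
```

The test `prop(x)` is recomputed from scratch at every node, which is exactly the inefficiency that the update and downdate machinery of Algorithm B is meant to avoid.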

5. Excluding cases with j = r or k = r from (3) yields respectively (312, 396, 430,
458, 458, 430, 396, 312) solutions. (With column r also omitted there are just (40, 46,
42, 80, 80, 42, 46, 40).)

6. Yes, probably for all n > 16. One such is $x_1 x_2 \ldots x_{17}$ = 2 17 12 10 7 14 3 5 9 13
15 4 11 8 6 1 16. [See Proc. Edinburgh Math. Soc. 8 (1890), 43 and Fig. 52.]

7. Yes: (42736815,42736851); also therefore (57263148,57263184). 

8. Yes, at least when m = 4; e.g., $x_1 \ldots x_{16}$ = 5 8 13 16 3 7 15 11 6 2 10 14 1 4
9 12. There are no solutions when m = 5, but 7 10 13 20 17 24 3 6 23 11 16 21 4 9
14 2 19 22 1 8 5 12 15 18 works for m = 6. (Are there solutions for all even $m \ge 4$?
C. F. de Jaenisch, Traité des applications de l’analyse mathématique au jeu des échecs
2 (1862), 132–133, noted that all 8-queen solutions have four of each color. He proved
that the number of white queens must be even, because $\sum (x_k + k)$ is even.)

9. Let bit vectors $a_l$, $b_l$, $c_l$ represent the “useful” elements of the sets in (6), with
$a_l = \sum\{2^{x-1} \mid x \in A_l\}$, $b_l = \sum\{2^{x-1} \mid x \in B_l \cap [1..n]\}$, $c_l = \sum\{2^{x-1} \mid x \in C_l \cap [1..n]\}$.

Then step W2 sets $S_l \gets \mu \mathbin{\&} \bar a_l \mathbin{\&} \bar b_l \mathbin{\&} \bar c_l$, where $\mu$ is the mask $2^n - 1$.

In step W3 we can set $t \gets S_l \mathbin{\&} (-S_l)$, $a_l \gets a_{l-1} + t$, $b_l \gets (b_{l-1} + t) \gg 1$,
$c_l \gets ((c_{l-1} + t) \ll 1) \mathbin{\&} \mu$; and it’s also convenient to set $S_l \gets S_l - t$ at this time, instead
of deferring this change to step W4.

(There’s no need to store $x_l$ in memory, or even to compute $x_l$ in step W3 as an
integer in [1..n], because $x_l$ can be deduced from $a_l - a_{l-1}$ when a solution is found.)
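
The bitwise operations of answer 9 can be tried out with a small counting routine. The sketch below is recursive rather than the iterative Algorithm W, and the function name and argument layout are assumptions chosen for illustration; only the masking and shifting steps are taken from the answer.

```python
def count_queens_bitwise(n):
    """Count n-queens solutions with the bit-vector updates of answer 9."""
    mu = (1 << n) - 1                      # the mask 2^n - 1

    def place(a, b, c, level):
        if level == n:
            return 1
        s = mu & ~a & ~b & ~c              # usable columns on this row
        total = 0
        while s:
            t = s & -s                     # isolate the rightmost 1 bit of s
            s -= t                         # remove it from the candidate set
            total += place(a + t,                  # occupied columns
                           (b + t) >> 1,           # one diagonal direction
                           ((c + t) << 1) & mu,    # the other direction
                           level + 1)
        return total

    return place(0, 0, 0, 0)
```

Because `t` is always a bit of `s`, it is disjoint from `a`, `b`, and `c`, so the additions never carry and behave like bitwise OR.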

10. (a) Only when n = 1, because reflected queens can capture each other.

(b) Queens not in the center must appear in groups of four.

(c) The four queens occupy the same rows, columns, and diagonals in both cases.

(d) In each solution counted by $c_n$ we can independently tilt (or not) each of the
$\lfloor n/4 \rfloor$ groups of four. [Mathematische Unterhaltungen und Spiele 1, second edition
(Leipzig: Teubner, 1910), 249–258.]
11. Suppose the $x_k$ are distinct. Then $\sum_{k=1}^{n} (x_k + k) = 2\binom{n+1}{2} \equiv 0$ (modulo n). If the
numbers $(x_k + k) \bmod n$ are also distinct, we have also $\sum_{k=1}^{n} (x_k + k) \equiv \binom{n+1}{2}$. But that is
impossible when n is even.

Now suppose further that the numbers $(x_k - k) \bmod n$ are distinct. Then we
have $\sum_{k=1}^{n} (x_k + k)^2 \equiv \sum_{k=1}^{n} (x_k - k)^2 \equiv \sum_{k=1}^{n} k^2 = n(n+1)(2n+1)/6$. And we
also have $\sum_{k=1}^{n} \bigl((x_k + k)^2 + (x_k - k)^2\bigr) = 4n(n+1)(2n+1)/6 \equiv 2n/3$, which is
impossible when n is a multiple of 3. [See W. Ahrens, Mathematische Unterhaltungen
und Spiele 2, second edition (1918), 364–366, where G. Pólya cites a more general
result of A. Hurwitz that applies to wraparound diagonals of other slopes.]

Conversely, if n isn’t divisible by 2 or 3, we can let $x_n = n$ and $x_k = (2k) \bmod n$
for $1 \le k < n$. (The rule $x_k = (3k) \bmod n$ also works. See Édouard Lucas, Récréations
Mathématiques 1 (1882), 84–86.)
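
The converse construction is easy to check by machine. In this sketch the helper names `toroidal_queens` and `is_toroidal_solution` are invented for the example; the rule itself is the one stated in the answer.

```python
def toroidal_queens(n, slope=2):
    """x_k = (slope * k) mod n for 1 <= k < n, and x_n = n."""
    return [(slope * k) % n or n for k in range(1, n + 1)]


def is_toroidal_solution(x):
    """True if the columns and both families of wraparound diagonals are distinct."""
    n = len(x)
    cols = set(x)
    sums = {(xk + k) % n for k, xk in enumerate(x, start=1)}
    diffs = {(xk - k) % n for k, xk in enumerate(x, start=1)}
    return len(cols) == len(sums) == len(diffs) == n
```

For example, `toroidal_queens(5)` gives `[2, 4, 1, 3, 5]`, while the same rule fails for n = 6 and n = 9, in agreement with the parity and divisibility arguments above.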

12. The (n + 1) queens problem clearly has a solution with a queen in a corner if and
only if the n queens problem has a solution with a queen-free main diagonal. Hence by
the previous answer there’s always a solution when $n \bmod 6 \in \{0, 1, 4, 5\}$.

Another nice solution was found by J. Franel [L’Intermédiaire des Mathématiciens
1 (1894), 140–141] when $n \bmod 6 \in \{2, 4\}$: Let $x_k = (n/2 + 2k - 3[2k \le n]) \bmod n + 1$,
for $1 \le k \le n$. With this setup we find that $x_k - x_j = \pm(k - j)$ and $1 \le j < k \le n$
implies (1 or 3)(k − j) + (0 or 3) ≡ 0 (modulo n); hence k − j = n − (1 or 3). But the
values of $x_1$, $x_2$, $x_3$, $x_{n-2}$, $x_{n-1}$, $x_n$ give no attacking queens except when n = 2.

Franel’s solution has empty diagonals, so it provides solutions also for $n \bmod 6 \in
\{3, 5\}$. We conclude that only n = 2 and n = 3 are impossible.

[A more complicated construction for all n > 3 had been given earlier by E. Pauls,
in Deutsche Schachzeitung 29 (1874), 129–134, 257–267. Pauls also explained how to
find all solutions, in principle, by building the tree level by level (not backtracking).]
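
Franel’s formula can be tested directly. The following sketch uses hypothetical helper names (`franel`, `is_queens_solution`); the Iverson bracket $[2k \le n]$ becomes a Python comparison, which evaluates to 0 or 1.

```python
def franel(n):
    """Franel's placement x_k = (n/2 + 2k - 3[2k <= n]) mod n + 1, for even n."""
    half = n // 2
    return [(half + 2 * k - 3 * (2 * k <= n)) % n + 1 for k in range(1, n + 1)]


def is_queens_solution(x):
    """True if x is a permutation of 1..n with no two queens on a common diagonal."""
    n = len(x)
    return (sorted(x) == list(range(1, n + 1))
            and len({xk - k for k, xk in enumerate(x, start=1)}) == n
            and len({xk + k for k, xk in enumerate(x, start=1)}) == n)
```

For n = 8 the formula produces 4 6 8 2 7 1 3 5, and for n = 2 it produces the attacking pair 1 2, the one exception noted in the answer.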

13. For $1 \le j \le n$, let $x_1^{(j)} \ldots x_m^{(j)}$ be a solution for m queens, and let $y_1 \ldots y_n$ be a
solution for n toroidal queens. Then $X_{(i-1)n+j} = (x_i^{(j)} - 1)n + y_j$ (for $1 \le i \le m$ and
$1 \le j \le n$) is a solution for mn queens. [I. Rivin, I. Vardi, and P. Zimmermann, AMM
101 (1994), 629–639, Theorem 2.]

14. [Rivin, Vardi, and Zimmermann, in the paper just cited, observe that in fact the
sequence $(\ln Q(n))/(n \ln n)$ appears to be increasing.]

15. Let the queen in row k be in cell $x_k$. Then we have a “relaxation” of the n queens
problem, with $|x_k - x_j|$ becoming just $x_k - x_j$ in (3); so we can ignore the b vector in
Algorithm B* or in exercise 9. We get

     n    H(n)
     0    1
     1    1
     2    1
     3    3
     4    7
     5    23
     6    83
     7    405
     8    2113
     9    12657
    10    82297
    11    596483
    12    4698655
    13    40071743
    14    367854835

[N. J. Cavenagh and I. M. Wanless, Discr. Appl. Math. 158 (2010), 136–146, Table 2.]
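
The small entries of this table can be reproduced by a direct backtrack. The sketch below assumes the relaxed condition is $x_k - x_j \ne k - j$ for $j < k$, which is the same as saying that the differences $x_k - k$ are all distinct; the function name is invented for the example.

```python
def count_relaxed(n):
    """Count permutations x_1..x_n of {1..n} whose differences x_k - k are distinct."""
    def extend(k, used_cols, used_diag):
        if k > n:
            return 1
        total = 0
        for x in range(1, n + 1):
            if x not in used_cols and (x - k) not in used_diag:
                total += extend(k + 1, used_cols | {x}, used_diag | {x - k})
        return total

    return extend(1, frozenset(), frozenset())
```

Only one diagonal direction is tracked, which mirrors the remark that the b vector can be ignored.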

16. It fails spectacularly in step L5. The minus signs, which mark decisions that were 
previously forced, are crucial tags for backtracking. 

17. $x_4 \ldots x_8$ = 21040, $p_0 \ldots p_4$ = 33300, and $y_1 y_2 y_3$ = 130. (If $x_l < 0$ the algorithm
will never look at $y_l$; hence the current state of $y_4 \ldots y_8$ is irrelevant. But $y_4 y_5$ happens
to be 20, because of past history; $y_6$, $y_7$, and $y_8$ haven’t yet been touched.)

19. We could say $D_l$ is $\{-n, \ldots, -2, -1, 1, 2, \ldots, n\}$, or $\{k \mid k \ne 0$ and $2 - l \le k \le
2n - l - 1\}$, or anything in between. (But this observation isn’t very useful.)
20. First we add a Boolean array $a_1 \ldots a_n$, where $a_k$ means “k has appeared,” as in
Algorithm B*. It’s 0…0 in step L1; we set $a_k \gets 1$ in step L3, $a_k \gets 0$ in step L5.

The loop in step L2 becomes “while $x_l < 0$, go to L5 if $l \ge n - 1$ and $a_{2n-l-1} = 0$,
otherwise set $l \gets l + 1$.” After finding $l + k + 1 \le 2n$ in L3, and before testing $x_{l+k+1}$
for 0, insert this: “If $l \ge n - 1$ and $a_{2n-l-1} = 0$, while $l + k + 1 \ne 2n$ set $j \gets k$, $k \gets p_k$.”

21. (a) In any solution $x_k = n \iff x_{k+n+1} = -n \iff x^D_{n-k} = n$.

(b) $x_k = n - 1$ for some $k \le n/2$ if and only if $x^D_k = n - 1$ for some $k > n/2$.

(c) Let $n' = n - [n \text{ is even}]$. Change ‘$l \ge n - 1$ and $a_{2n-l-1} = 0$’ in the modified
step L2 to ‘($l = \lfloor n/2 \rfloor$ and $a_{n'} = 0$) or ($l \ge n - 1$ and $a_{2n-l-1} = 0$)’. Insert the following
before the other insertion into step L3: “If $l = \lfloor n/2 \rfloor$ and $a_{n'} = 0$, while $k \ne n'$ set
$j \gets k$, $k \gets p_k$.” And in step L5 (this subtle detail is needed when n is even) go to
L5 instead of L4 if $l = \lfloor n/2 \rfloor$ and $k = n'$.

22. The solutions 11 and 2112 for n = 1 and n = 2 are self-dual; the solutions for n = 4
and n = 5 are 43112342, 2452311435, 4511234253, and their duals. The total number
of solutions for n = 1, 2, … is 1, 1, 0, 2, 4, 20, 0, 156, 516, 2008, 0, 52536, 297800,
1767792, 0, 75678864, …; there are none when n mod 4 = 3, by a parity argument.

Algorithm L needs only obvious changes. To compute solutions by a streamlined
method like exercise 21, use $n' = n - (0, 1, 2, 0)$ and substitute ‘$l = \lfloor n/4 \rfloor + (0, 1, 2, 1)$’
for ‘$l = \lfloor n/2 \rfloor$’ when n mod 4 = (0, 1, 2, 3); also replace ‘$l \ge n - 1$ and $a_{2n-l-1} = 0$’
by ‘$l \ge \lfloor n/2 \rfloor$ and $a_{\lfloor (4n+2-2l)/3 \rfloor} = 0$’. The case n = 15 is proved impossible with 397
million nodes and 9.93 gigamems.

23. slums → sluff, slump, slurs, slurp, or sluts; (slums, total) → (slams, tonal).

24. Build the list of 5-letter words and the trie of 6-letter words in step B1; also set
$a_{01} a_{02} a_{03} a_{04} a_{05} \gets 00000$. Use $\min D_l = 1$ in step B2 and $\max D_l = 5757$ in step B4.
Testing $P_l$ in step B3, if word $x_l$ is $c_1 c_2 c_3 c_4 c_5$, consists of forming $a_{l1} \ldots a_{l5}$, where
$a_{lk} = \mathit{trie}[a_{(l-1)k}, c_k]$ for $1 \le k \le 5$; but jump to B4 if any $a_{lk}$ is zero.
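
The row-by-row test of answer 24 can be sketched on a toy vocabulary. This is a simplified illustration, not Algorithm B itself: the trie is a nested dictionary instead of a 26-column array, a missing key plays the role of a zero trie entry, and all function names are assumptions.

```python
def build_trie(words):
    """Trie of the column words, as nested dictionaries keyed by letter."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
    return root


def word_rectangles(row_words, col_words, rows):
    """Yield tuples of `rows` row words whose columns all lie in col_words.

    Assumes every column word has exactly `rows` letters.
    """
    trie = build_trie(col_words)
    width = len(row_words[0])
    x = []                                        # row words chosen so far

    def extend(nodes):                            # nodes[k]: trie node of column k
        if len(x) == rows:
            yield tuple(x)
            return
        for w in row_words:
            nxt = [nodes[k].get(w[k]) for k in range(width)]
            if all(node is not None for node in nxt):   # every column is still a prefix
                x.append(w)
                yield from extend(nxt)
                x.pop()

    yield from extend([trie] * width)
```

With the three words cat, are, tea used for both rows and columns, the only rectangle found is the word square cat / are / tea.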

25. There are 5 × 26 singly linked lists, accessed from pointers $h_{kc}$, all initially zero.
The xth word $c_{x1} c_{x2} c_{x3} c_{x4} c_{x5}$, for $1 \le x \le 5757$, belongs to 5 lists and has five pointers
$l_{x1} l_{x2} l_{x3} l_{x4} l_{x5}$. To insert it, set $l_{xk} \gets h_{k c_{xk}}$, $h_{k c_{xk}} \gets x$, and $s_{k c_{xk}} \gets s_{k c_{xk}} + 1$, for
$1 \le k \le 5$. (Thus $s_{kc}$ will be the length of the list accessed from $h_{kc}$.)

We can store a “signature” $\sum_{c=1}^{26} 2^{c-1} [\mathit{trie}[a, c] \ne 0]$ with each node a of the trie.
For example, the signature for node 260 is $2^0 + 2^4 + 2^8 + 2^{14} + 2^{17} + 2^{20} + 2^{24}$ = #1124111,
according to (11); here a ↔ 1, …, z ↔ 26.

The process of running through all x that match a given signature y with respect
to position z, as needed in steps B2 and B4, now takes the following form: (i) Set
$i \gets 0$. (ii) While $2^i \mathbin{\&} y = 0$, set $i \gets i + 1$. (iii) Set $x \gets h_{z(i+1)}$; go to (vi) if x = 0.
(iv) Visit x. (v) Set $x \gets l_{xz}$; go to (iv) if $x \ne 0$. (vi) Set $i \gets i + 1$; go to (ii) if $2^i \le y$.

Let trie[a, 0] be the signature of node a. We choose z and $y = \mathit{trie}[a_{(l-1)z}, 0]$ in
step B2 so that the number of nodes to visit, $\sum_c s_{zc} [2^{c-1} \mathbin{\&} y \ne 0]$, is minimum for
$1 \le z \le 5$. For example, when l = 3, $x_1 = 1446$, and $x_2 = 185$ as in (10), that sum for
z = 1 is $s_{11} + s_{15} + s_{19} + s_{1(15)} + s_{1(18)} + s_{1(21)} + s_{1(25)}$ = 296 + 129 + 74 + 108 + 268 + 75 + 47 =
997; and the sums for z = 2, 3, 4, 5 are 4722, 1370, 5057, and 1646. Hence we choose
z = 1 and y = #1124111; only 997 words, not 5757, need be tested for $x_3$.

The values $y_l$ and $z_l$ are maintained for use in backtracking. (In practice we keep
x, y, and z in registers during most of the computation. Then we set $x_l \gets x$, $y_l \gets y$,
$z_l \gets z$ before increasing $l \gets l + 1$ in step B3; and we set $x \gets x_l$, $y \gets y_l$, $z \gets z_l$ in
step B5. We also keep i in a register, while traversing the sublists as above; this value
is restored in step B5 by setting it to the zth letter of word x, decreased by ’a’.)
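
The signature arithmetic can be checked with a short sketch. The helper names are invented here, and letters are mapped to bit positions 0 to 25 as in the answer (a ↔ bit 0, …, z ↔ bit 25).

```python
def signature(letters):
    """Bit vector with bit (c - 1) set for each letter c that has a trie child."""
    return sum(1 << (ord(ch) - ord('a')) for ch in set(letters))


def letters_of(sig):
    """Run through the 1 bits of a signature, as in steps (i), (ii), and (vi)."""
    i, found = 0, []
    while (1 << i) <= sig:
        if sig & (1 << i):
            found.append(chr(ord('a') + i))
        i += 1
    return found
```

The seven bits in the example signature #1124111 correspond to the letters a, e, i, o, r, u, y, and the seven list lengths quoted above do sum to 997.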

26. Here are the author’s favorite 5 × 7 and 5 × 8, and the only 5 × 9’s:
agentival
coelomate 
undeleted 
oysterers 


No 5 x 10 word rectangles exist, according to our ground rules. 

27. (1, 15727, 8072679, 630967290, 90962081, 625415) and (15727.0, 4321.6, 1749.7,
450.4, 286.0). Total time ≈ 18.3 teramems. (In Section 7.2.2.1 we’ll study a method
that is symmetrical between rows and columns.)


28. Build a separate trie for the m-letter words; but instead of having trie nodes of
size 26 as in (11), it’s better to convert this trie into a compressed representation that
omits the zeros. For example, the compressed representation of the node for prefix
‘corne’ in (12) consists of five consecutively stored pairs of entries (‘a’, 3879), (‘d’,
3878), (‘l’, 9602), (‘r’, 171), (‘t’, 5013), followed by (0, 0). Similarly, each shorter prefix
with c descendants is represented by c consecutive pairs (character, link), followed by
(0, 0) to mark the end of the node. Steps B3 and B4 are now very convenient.

Level l corresponds to row $i_l = 1 + (l - 1) \bmod m$ and column $j_l = 1 + \lfloor (l - 1)/m \rfloor$.
For backtracking we store the n-trie pointer $a_{i_l j_l}$ as before, together with an index $x_l$
into the compressed m-trie.

This method was suggested by Bernard Gattegno in 1996 (unpublished). It finds
all 5 × 6 word rectangles in just 400 gigamems; and its running time for “transposed”
6 × 5 rectangles turns out to be slightly less (380 gigamems). Notice that only one mem
is needed to access each (character, link) pair in the compressed trie.


29. Leave out face and (of course) dada; the remaining eleven are fine.

30. Keep tables $p_i$, $p'_{ij}$, $p''_{ijk}$, $s_i$, $s'_{ij}$, $s''_{ijk}$, for $0 \le i, j, k < m$, each capable of storing a
ternary digit. Also keep a table $x_0, x_1, \ldots$ of tentatively accepted words. Begin with
$g \gets 0$. Then for each input $w_j = abcd$, where $0 \le a, b, c, d < m$, set $x_g \gets abcd$ and
also do the following: Set $p_a \gets p_a \dotplus 1$, $p'_{ab} \gets p'_{ab} \dotplus 1$, $p''_{abc} \gets p''_{abc} \dotplus 1$, $s_d \gets s_d \dotplus 1$,
$s'_{cd} \gets s'_{cd} \dotplus 1$, $s''_{bcd} \gets s''_{bcd} \dotplus 1$, where $x \dotplus y = \min(2, x + y)$ denotes saturating ternary
addition. Then if $s_{a'} p''_{b'c'd'} + s'_{a'b'} p'_{c'd'} + s''_{a'b'c'} p_{d'} = 0$ for all $x_k = a'b'c'd'$, where
$0 \le k \le g$, set $g \gets g + 1$. Otherwise reject $w_j$ and set $p_a \gets p_a - 1$, $p'_{ab} \gets p'_{ab} - 1$,
$p''_{abc} \gets p''_{abc} - 1$, $s_d \gets s_d - 1$, $s'_{cd} \gets s'_{cd} - 1$, $s''_{bcd} \gets s''_{bcd} - 1$.

31. (a) The word bc appears in message abcd if and only if a → b, b → c, and c → d.

(b) For $0 \le k < r$, put vertex v into class k if the longest path from v has
length k. Given any such partition, we can include all arcs from class k to class j < k
without increasing the path lengths. So it’s a question of finding the maximum of
$\sum_{0 \le j < k < r} p_j p_k$ subject to $p_0 + p_1 + \cdots + p_{r-1} = m$. The values $p_j = \lfloor (m + j)/r \rfloor$ achieve
this (see exercise 7.2.1.4–68(a)). When r = 3 the maximum simplifies to $\lfloor m^2/3 \rfloor$.
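
The maximization in part (b) can be confirmed by brute force for small cases. The function names below are hypothetical; the balanced choice $p_j = \lfloor (m+j)/r \rfloor$ is the one stated in the answer.

```python
from itertools import product


def pair_sum(p):
    """Sum of p_j * p_k over all pairs j < k."""
    return sum(p[j] * p[k] for k in range(len(p)) for j in range(k))


def best_by_search(m, r):
    """Maximum of pair_sum over all nonnegative p_0..p_{r-1} with sum m."""
    return max(pair_sum(p)
               for p in product(range(m + 1), repeat=r) if sum(p) == m)


def balanced(m, r):
    """The class sizes p_j = floor((m + j) / r)."""
    return [(m + j) // r for j in range(r)]
```

The exhaustive search is exponential in r, so it is useful only as a check on small values of m and r.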


32. (a) The factors of the period, 15 926 535 89 79 323 8314, begin at the respective
boundary points 3, 5, 8, 11, 13, 15, 18 (and then 3 + 19 = 22, etc.). Thus round 1
retains boundaries 5, 8, and 15. The second-round substrings $y_0$ = 926, $y_1$ = 5358979,
$y_2$ = 323831415 have different lengths, so lexicographic comparison is unnecessary; the
answer is $y_2 y_0 y_1 = x_{15} \ldots x_{33}$.

(b) Each substring consists of at least three substrings of the previous round. 
(c) Let $a_0 = 0$, $b_0 = 1$, $a_{e+1} = a_e a_e b_e$, $b_{e+1} = a_e b_e b_e$; use $a_e$ or $b_e$ when $n = 3^e$.

(d) We use an auxiliary subroutine ‘less(i)’, which returns $[y_{i-1} < y_i]$, given i > 0:
If $b_i - b_{i-1} \ne b_{i+1} - b_i$, return $[b_i - b_{i-1} < b_{i+1} - b_i]$. Otherwise, for j = 0, 1, …, while
$b_i + j < b_{i+1}$, if $x_{b_{i-1}+j} \ne x_{b_i+j}$ return $[x_{b_{i-1}+j} < x_{b_i+j}]$. Otherwise return 0.

The tricky part of the algorithm is to discard initial factors that aren’t periodic.
The secret is to let $i_0$ be the smallest index such that $y_{i-3} \ge y_{i-2} < y_{i-1}$; then we can
be sure that a factor begins with $y_i$.

O1. [Initialize.] Set $x_j \gets x_{j-n}$ for $n \le j < 2n$, $b_j \gets j$ for $0 \le j < 2n$, and $t \gets n$.

O2. [Begin a round.] Set $t' \gets 0$. Find the smallest i > 0 such that less(i) = 0.
Then find the smallest $j \ge i + 2$ such that less(j − 1) = 1 and $j \le t + 2$. (If
no such j exists, report an error: The input x was equal to one of its cyclic
shifts.) Set $i \gets i_0 \gets j \bmod t$. (Now a dip of the period begins at $i_0$.)

O3. [Find the next factor.] Find the smallest $j \ge i + 2$ such that less(j − 1) = 1.
If j − i is even, go to O5.

O4. [Retain a boundary.] If j < t, set $b'_{t'} \gets b_j$; otherwise set $b'_k \gets b'_{k-1}$ for
$t' \ge k > 0$ and $b'_0 \gets b_{j-t}$. Finally set $t' \gets t' + 1$.

O5. [Done with round?] If $j < i_0 + t$, set $i \gets j$ and return to O3. Otherwise, if
$t' = 1$, terminate; σx begins at item $x_{b'_0}$. Otherwise set $t \gets t'$, $b_k \gets b'_k$ for
$0 \le k < t$, and $b_k \gets b_{k-t} + n$ for $k \ge t$ while $b_{k-t} < n$. Return to O2. ∎

(e) Say that a “superdip” is a dip of odd length followed by zero or more dips of even
length. Any infinite sequence y that begins with an odd-length dip has a unique
factorization into superdips. Those superdips can, in turn, be regarded as atomic elements
of a higher-level string that can be factored into dips. The result σx of Algorithm O
is an infinite periodic sequence that allows repeated factorization into infinite periodic
sequences of superdips at higher and higher levels, until becoming constant.

Notice that the first dip of σx ends at position $i_0$ in the algorithm, because its
length isn’t 2. Therefore we can prove the commafree property by observing that, if
codeword σx″ appears within the concatenation σxσx′ of two codewords, its superdip
factors are also superdip factors of those codewords. This yields a contradiction if any
of σx, σx′, or σx″ is a superdip. Otherwise the same observation applies to the superdip
factors at the next level. [Eastman’s original algorithm was essentially the same, but
presented in a more complicated way; see IEEE Transactions IT-11 (1965), 263–267.
R. A. Scholtz subsequently discovered an interesting and totally different way to define
exactly the same set of codewords, in IEEE Transactions IT-15 (1969), 300–306.]

33. Let $f_k(m)$ be the number of dips of length k for which $m > z_1$ and $z_k < m$. The
number of such sequences with $z_{k-1} = j$ is $(m - j - 1)\binom{m+k-j-3}{k-2} = (k - 1)\binom{m+k-j-3}{k-1}$;
summing for $0 \le j < m$ gives $f_k(m) = (k - 1)\binom{m+k-2}{k}$. Thus $F_m(z) = \sum_{k \ge 0} f_k(m) z^k =
(mz - 1)/(1 - z)^m$. (The fact that $f_0(m) = -1$ in these formulas turns out to be useful!)

Algorithm O finishes in one round if and only if some cyclic shift of x is a superdip.
The number of aperiodic x that finish in one round is therefore $n [z^n]\, G_m(z)$, where

$$G_m(z) = \frac{F_m(-z) - F_m(z)}{F_m(-z) + F_m(z)} = \frac{(1 + mz)(1 - z)^m - (1 - mz)(1 + z)^m}{(1 + mz)(1 - z)^m + (1 - mz)(1 + z)^m}.$$

To get the stated probability, divide by $\sum_{d \backslash n} \mu(d)\, m^{n/d}$, the number of aperiodic x.
(See Eq. 7.2.1.1—( 6 o). For n = 3, 5, 7, 9 these probabilities are 1 ， 1 ， 1， and 1—3/d 一 丄 ).) 

34. If so, it couldn’t have 0011, 0110, 1100, or 1001.


35. That section considered such representations of stacks and queues, but not of
unordered sets, because large blocks of sequential memory were either nonexistent or
ultra-expensive in olden days. Linked lists were the only decent option for families of
variable-size sets, because they could more readily fit in a limited high-speed memory.

36. (a) The blue word x with a = d (namely 1101) appears in its P2 list at location 5e. 
(b) The P3 list for words of the form 010* is empty. (Both 0100 and 0101 are red.) 




37. (a) The S2 list of 0010 has become closed (hence 0110 and 1110 are hidden). 

(b) Word 1101 moved to the former position of 1001 in its S1 list, when 1001
became red. (Previously 1011 had moved to the former position of 0001.)

38. In this case, which of course happens rarely, it’s safe to set all elements of STAMP
to zero and set $\sigma \gets 1$. (Do not be tempted to save one line of code by setting all STAMP
elements to −1 and leaving $\sigma = 0$. That might fail when σ reaches the value −1!)
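
A small sketch may make the stamping idea concrete. It assumes that the question concerns overflow of the stamp counter σ in a fixed-width field; the class `StampedSet` and its `bits` parameter are invented for this illustration and do not appear in Algorithm C.

```python
class StampedSet:
    """A set over {0, ..., size-1} that is emptied in O(1) by bumping a stamp."""

    def __init__(self, size, bits=8):
        self.limit = 1 << bits           # assumed: stamps must stay below 2^bits
        self.stamp = [0] * size
        self.sigma = 1

    def clear(self):
        self.sigma += 1
        if self.sigma == self.limit:     # the rare overflow case
            self.stamp = [0] * len(self.stamp)   # zero every STAMP entry
            self.sigma = 1                       # and restart with sigma = 1

    def add(self, i):
        self.stamp[i] = self.sigma

    def __contains__(self, i):
        return self.stamp[i] == self.sigma
```

After the reset no stale entry can equal the new stamp, because every entry is 0 and σ is 1; that is why the reset is safe.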

39. (a) Set $r \gets m + 1$. Then for $k \gets 0, 1, \ldots, f - 1$, set $t \gets$ FREE[k], $j \gets$
MEM[CLOFF + 4t + $m^4$] − (CLOFF + 4t), and if j < r set $r \gets j$, $c \gets t$; break out of the
loop if r = 0.

(b) If r > 0 set $x \gets$ MEM[CLOFF + 4cl(ALF[x])].

(c) If r > 1 set $q \gets 0$, $p' \gets$ MEM[PP], and $p \gets$ POISON. While $p < p'$ do the
following steps: Set $y \gets$ MEM[p], $z \gets$ MEM[p + 1], $y' \gets$ MEM[y + $m^4$], and $z' \gets$
MEM[z + $m^4$]. (Here y and z point to the heads of prefix or suffix lists; $y'$ and $z'$ point
to the tails.) If $y = y'$ or $z = z'$, delete entry p from the poison list; this means, as
in (18), to set $p' \gets p' - 2$, and if $p \ne p'$ to store(p, MEM[$p'$]) and store(p + 1, MEM[$p'$ + 1]).
Otherwise set $p \gets p + 2$; if $y' - y \ge z' - z$ and $y' - y > q$, set $q \gets y' - y$ and $x \gets$ MEM[z];
if $y' - y < z' - z$ and $z' - z > q$, set $q \gets z' - z$ and $x \gets$ MEM[y]. Finally, after p has
become equal to $p'$, store(PP, $p'$) and set $c \gets$ cl(ALF[x]). (Experiments show that this
“max kill” strategy for r > 1 slightly outperforms a selection strategy based on r alone.)

40. (a) First there’s a routine rem(α, δ, o) that removes an item from a list, following
the protocol (21): Set $p \gets \delta + o$ and $q \gets$ MEM[p + $m^4$] − 1. If $q \ge p$ (meaning that
list p isn’t closed or being killed), store(p + $m^4$, q), set $t \gets$ MEM[α + o − $m^4$]; and if
$t \ne q$ also set $y \gets$ MEM[q], store(t, y), and store(ALF[y] + o − $m^4$, t).

Now, to redden x we set $\alpha \gets$ ALF[x], store(α, RED); then rem(α, $p_1(\alpha)$, P1OFF),
rem(α, $p_2(\alpha)$, P2OFF), …, rem(α, $s_3(\alpha)$, S3OFF), and rem(α, 4cl(α), CLOFF).

(b) A simple routine close(δ, o) closes list δ + o: Set $p \gets \delta + o$ and $q \gets$ MEM[p + $m^4$];
if $q \ne p - 1$, store(p + $m^4$, p − 1).

Now, to green x we set $\alpha \gets$ ALF[x], store(α, GREEN); then close($p_1(\alpha)$, P1OFF),
close($p_2(\alpha)$, P2OFF), …, close($s_3(\alpha)$, S3OFF), and close(4cl(α), CLOFF). Finally, for $p \le
r < q$ (using the p and q that were just set within ‘close’), if MEM[r] ≠ x redden MEM[r].

(c) First set $p' \gets$ MEM[PP] + 6, and store($p'$ − 6, $p_1(\alpha)$ + S1OFF), store($p'$ − 5, $s_3(\alpha)$ +
P3OFF), store($p'$ − 4, $p_2(\alpha)$ + S2OFF), store($p'$ − 3, $s_2(\alpha)$ + P2OFF), store($p'$ − 2, $p_3(\alpha)$ +
S3OFF), store($p'$ − 1, $s_1(\alpha)$ + P1OFF); this adds the three poison items (27).

Then set $p \gets$ POISON and do the following while $p < p'$: Set y, z, $y'$, $z'$ as in
answer 39(c), and delete poison entry p if $y = y'$ or $z = z'$. Otherwise if $y' < y$ and
$z' < z$, go to C6 (a poisoned suffix-prefix pair is present). Otherwise if $y' > y$ and
$z' > z$, set $p \gets p + 2$. Otherwise if $y' < y$ and $z' > z$, store(z + $m^4$, z), redden MEM[r]
for $z \le r < z'$, and delete poison entry p. Otherwise (namely if $y' > y$ and $z' < z$),
store(y + $m^4$, y), redden MEM[r] for $y \le r < y'$, and delete poison entry p.

Finally, after p has become equal to $p'$, store(PP, $p'$).
42. Exercise 32 exhibits such codes explicitly for all odd n. The earliest papers on
the subject gave solutions for n = 2, 4, 6, 8. Yoji Niho subsequently found a code for
n = 10 but was unable to resolve the case n = 12 [IEEE Trans. IT-19 (1973), 580–581].

This problem can readily be encoded in CNF and given to a SAT solver. The
case n = 10 involves 990 variables and 8.6 million clauses, and is solved by
Algorithm 7.2.2.2C in 10.5 gigamems. The case n = 12 involves 4020 variables and 175
million clauses. After being split into seven independent subproblems (by appending
mutually exclusive unit clauses), it was proved unsatisfiable by that algorithm after
about 86 teramems of computation.

So the answer is “No.” The maximum-size code for n = 12 remains unknown.
44. (a) There are 28 commafree binary codes of size 3 and length 4; Algorithm C
produces half of them, because it assumes that cycle class [0001] is represented by
0001 or 0010. They form eight equivalence classes, two of which are symmetric under
the operation of complementation-and-reflection; representatives are {0001, 0011, 0111}
and {0010, 0011, 1011}. The other six are represented by {0001, 0110, 0111 or 1110},
{0001, 1001, 1011 or 1101}, {0001, 1100, 1101}, {0010, 0011, 1101}.

(b) Algorithm C produces half of the 144 solutions, which form twelve equivalence
classes. Eight are represented by {0001, 0002, 1001, 1002, 2201, 2001, 2002, 2011,
2012, 2102, 2112, 2122 or 2212} and ({0102, 1011, 1012} or {1020, 1101, 2101}) and
({1202, 2202, 2111} or {2021, 2022, 1112}); four are represented by {0001, 0020, 0021,
0022, 1001, 1020, 1021, 1022, 1121 or 1211, 1201, 1202, 1221, 2001, 2201, 2202} and
({1011, 1012, 2221} or {1101, 2101, 1222}).

(c) Algorithm C yields half of the 2304 solutions, which form 48 equivalence classes.
Twelve classes have unique representatives that omit cycle classes [0123], [0103], [1213],
one such being the code {0010, 0020, 0030, 0110, 0112, 0113, 0120, 0121, 0122, 0130,
0131, 0132, 0133, 0210, 0212, 0213, 0220, 0222, 0230, 0310, 0312, 0313, 0320, 0322,
0330, 0332, 0333, 1110, 1112, 1113, 2010, 2030, 2110, 2112, 2113, 2210, 2212, 2213,
2230, 2310, 2312, 2313, 2320, 2322, 2330, 2332, 2333, 3110, 3112, 3113, 3210, 3212,
3213, 3230, 3310, 3312, 3313}. The others each have two representatives that omit
classes [0123], [0103], [0121], one such being the code {0001, 0002, 0003, 0201, 0203,
1001, 1002, 1003, 1011, 1013, 1021, 1022, 1023, 1031, 1032, 1033, 1201, 1203, 1211,
1213, 1221, 1223, 1231, 1232, 1233, 1311, 1321, 1323, 1331, 2001, 2002, 2003, 2021,
2022, 2023, 2201, 2203, 2221, 2223, 3001, 3002, 3003, 3011, 3013, 3021, 3022, 3023,
3031, 3032, 3033, 3201, 3203, 3221, 3223, 3321, 3323, 3331} and its isomorphic image
under reflection and (01)(23).
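
The count in part (a) is small enough to confirm by exhaustive search. The sketch below is a brute-force check, not Algorithm C; the function names are assumptions, and a code is taken to be commafree when no codeword occurs at an interior offset of the concatenation of any two codewords (possibly equal).

```python
from itertools import combinations, product


def is_commafree(code):
    """True if no codeword straddles the boundary of two concatenated codewords."""
    code = set(code)
    n = len(next(iter(code)))
    for x, y in product(code, repeat=2):
        xy = x + y
        if any(xy[i:i + n] in code for i in range(1, n)):
            return False
    return True


def count_commafree(m, n, size):
    """Number of commafree codes with `size` codewords of length n over m letters."""
    words = [''.join(w) for w in product('0123456789'[:m], repeat=n)]
    return sum(is_commafree(c) for c in combinations(words, size))
```

The search examines all 560 three-element subsets of the 16 binary words of length 4, which takes a negligible amount of time; the ternary and quaternary cases of parts (b) and (c) are far beyond this naive approach.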

45. (The maximum size of such a code is currently unknown. Algorithm C isn’t fast
enough to solve this problem on a single computer, but a sufficiently large cluster of
machines and/or an improved algorithm should be able to discover the answer. The
case m = 3 and n = 6 is also currently unsolved; a SAT solver shows quickly that a full
set of $(3^6 - 3^3 - 3^2 + 3^1)/6 = 116$ codewords cannot be achieved.)

47. The 3-bit sequences 101, 111, 110 were rejected before seeing 000. In general, to
make a uniformly random choice from q possibilities, the text suggests looking at the
next $t = \lceil \lg q \rceil$ bits $b_1 \ldots b_t$. If $(b_1 \ldots b_t)_2 < q$, we use choice $(b_1 \ldots b_t)_2 + 1$; otherwise
we reject $b_1 \ldots b_t$ and try again. [This simple method is optimum when $q \le 4$, and the
best possible running time for other values of q uses more than half as many bits. But a
better scheme is available for q = 5, using only $3\frac{3}{5}$ bits per choice instead of $4\frac{4}{5}$; and for
q = 6, one random bit reduces to the case q = 3. See D. E. Knuth and A. C. Yao,
Algorithms and Complexity, edited by J. F. Traub (Academic Press, 1976), 357–428, §2.]
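
The rejection method described here is short enough to state as code. In this sketch the function name and the convention of returning the rejected values are assumptions made for illustration; `bits` can be any iterator that yields 0s and 1s.

```python
def uniform_choice(q, bits):
    """Pick a choice in 1..q from a stream of random bits, by rejection.

    Returns (choice, rejected), where rejected lists the t-bit values discarded.
    """
    t = (q - 1).bit_length()             # t = ceil(lg q)
    rejected = []
    while True:
        value = 0
        for _ in range(t):
            value = 2 * value + next(bits)   # read (b_1 ... b_t)_2
        if value < q:
            return value + 1, rejected
        rejected.append(value)
```

Feeding it the bits 101 111 110 000 with q = 5 rejects three blocks and then returns choice 1, as in the first sentence of the answer. Each round succeeds with probability 5/8, so the expected cost is 3 · 8/5 = 4 4/5 bits.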
48. It’s the number of nodes on level l + 1 (depth l) of the search tree. (Hence we can
estimate the profile. Notice that $D = D_1 \ldots D_{l-1}$ in step E2 of Algorithm E.)

49. $Z_0 = C()$, $Z_{l+1} = c() + D_1 c(X_1) + D_1 D_2 c(X_1 X_2) + \cdots + D_1 \ldots D_l\, c(X_1 \ldots X_l) +
D_1 \ldots D_{l+1}\, C(X_1 \ldots X_{l+1})$.
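
A random-walk estimator of this kind can be sketched as follows, using the n queens tree with cost c = 1 at every node. The function names are illustrative, and the code is a simplified rendering of the idea behind Algorithm E rather than a transcription of its steps.

```python
import random


def queens_children(x, n):
    """Values that can legally extend the partial n-queens placement x."""
    l = len(x)
    return [v for v in range(1, n + 1)
            if all(x[j] != v and abs(x[j] - v) != l - j for j in range(l))]


def estimate_tree_size(n, rng):
    """One random probe: S = 1 + D_1 + D_1 D_2 + ... along a random path."""
    x, D, S = [], 1, 0
    while True:
        S += D                                   # weight of the current node
        kids = queens_children(x, n) if len(x) < n else []
        if not kids:
            return S
        D *= len(kids)                           # D becomes D_1 ... D_l
        x.append(rng.choice(kids))               # uniformly random child


def exact_tree_size(n, x=()):
    """Total number of nodes in the backtrack tree, by full traversal."""
    if len(x) == n:
        return 1
    return 1 + sum(exact_tree_size(n, x + (v,)) for v in queens_children(x, n))
```

For n = 4 the tree has 17 nodes, and every probe returns 13, 17, or 21; the average of many probes approaches 17.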

50. (a) True: The generating function is $z(z+1) \cdots (z+n-1)/n!$; see Eq. 1.2.10–(9).

(b) For instance, suppose Y 1 Y 2 .. .Yi = 1457 and n = 9. Alice’s probability is 

I 士誉暑 + 备暑 =mi. Elmo obtains X 1 X 2 •.. Xi = 7541 with probability H 去 

(c) The upper tail inequality (see exercise 1.2.10–22 with $\mu = H_n$) tells us that
$\Pr\bigl(l \ge (\ln n)(\ln\ln n)\bigr) \le \exp\bigl(-(\ln n)(\ln\ln n)(\ln\ln\ln n) + O((\ln n)(\ln\ln n))\bigr)$.

(d) If $k \le n/3$ we have $\sum_{j \le k} \binom{n}{j} \le 2\binom{n}{k}$. By exercise 1.2.6–67, the number of nodes
on the first $(\ln n)(\ln\ln n)$ levels is therefore at most $2\bigl(ne/((\ln n)(\ln\ln n))\bigr)^{(\ln n)(\ln\ln n)}$.


51. The key idea is to introduce recursive formulas analogous to (29):

$$m(x_1 \ldots x_l) = c(x_1 \ldots x_l) + \min\bigl(m(x_1 \ldots x_l x_{l+1}^{(1)})\,d, \ldots, m(x_1 \ldots x_l x_{l+1}^{(d)})\,d\bigr);$$
$$M(x_1 \ldots x_l) = c(x_1 \ldots x_l) + \max\bigl(M(x_1 \ldots x_l x_{l+1}^{(1)})\,d, \ldots, M(x_1 \ldots x_l x_{l+1}^{(d)})\,d\bigr);$$
$$\hat C(x_1 \ldots x_l) = c(x_1 \ldots x_l)^2 + \sum_{i=1}^{d} \bigl(d\,\hat C(x_1 \ldots x_l x_{l+1}^{(i)}) + 2c(x_1 \ldots x_l)\,C(x_1 \ldots x_l x_{l+1}^{(i)})\bigr).$$

They can be computed via auxiliary arrays MIN, MAX, KIDS, COST, and CHAT as follows:

At the beginning of step B2, set MIN[l] ← ∞, MAX[l] ← KIDS[l] ← COST[l] ←
CHAT[l] ← 0. Set KIDS[l] ← KIDS[l] + 1 just before $l \gets l + 1$ in step B3.

At the beginning of step B5, set $m \gets c(x_1 \ldots x_{l-1})$ + KIDS[l] × MIN[l], $M \gets
c(x_1 \ldots x_{l-1})$ + KIDS[l] × MAX[l], $C \gets c(x_1 \ldots x_{l-1})$ + COST[l], $\hat C \gets c(x_1 \ldots x_{l-1})^2$ +
KIDS[l] × CHAT[l] + 2 × $c(x_1 \ldots x_{l-1})$ × COST[l]. Then, after $l \gets l - 1$ is positive, set MIN[l] ←
min(m, MIN[l]), MAX[l] ← max(M, MAX[l]), COST[l] ← COST[l] + C, CHAT[l] ←
CHAT[l] + $\hat C$. But when l reaches zero in step B5, return the values m, M, C, $\hat C - C^2$.
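
The recurrences can be checked on a small tree with a recursive sketch. It computes the minimum, maximum, mean, and second moment of the estimate directly from the formulas, taking c = 1 at every node; the names `estimator_stats` and `queens4` are invented for the example, and the recursion replaces the MIN, MAX, KIDS, COST, and CHAT arrays.

```python
def queens_children(x, n):
    """Values that can legally extend the partial n-queens placement x."""
    l = len(x)
    return [v for v in range(1, n + 1)
            if all(x[j] != v and abs(x[j] - v) != l - j for j in range(l))]


def estimator_stats(children, node=()):
    """Return (min, max, mean, second moment) of the estimate S below node."""
    kids = [estimator_stats(children, node + (v,)) for v in children(node)]
    d, c = len(kids), 1                    # c = 1: every node costs one unit
    if d == 0:
        return (c, c, c, c * c)
    m = c + d * min(k[0] for k in kids)
    M = c + d * max(k[1] for k in kids)
    C = c + sum(k[2] for k in kids)
    C_hat = c * c + d * sum(k[3] for k in kids) + 2 * c * sum(k[2] for k in kids)
    return (m, M, C, C_hat)


def queens4(x):
    """Children in the 4-queens backtrack tree."""
    return queens_children(x, 4) if len(x) < 4 else []
```

For the 4 queens tree this gives minimum 13, maximum 21, mean 17, and second moment 297, hence variance 297 − 17² = 8.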


52. Let $p(i) = p_{X_1 \ldots X_{l-1}}(y_i)$, and simply set $D \gets D/p(I)$ instead of $D \gets Dd$. Then
node $x_1 \ldots x_l$ is reached with probability $\Pi(x_1 \ldots x_l) = p(x_1)\, p_{x_1}(x_2) \cdots p_{x_1 \ldots x_{l-1}}(x_l)$,
and $c(x_1 \ldots x_l)$ has weight $1/\Pi(x_1 \ldots x_l)$ in S; the proof of Theorem E goes through
as before. Notice that p(I) is the a posteriori probability of having taken branch I.

(The formulas of answer 51 should now use ‘/p(i)’ instead of ‘d’; and that algorithm
should be modified appropriately, no longer needing the KIDS array.)

53. Let $p_{x_1 \ldots x_{l-1}}(y_i) = C(x_1 \ldots x_{l-1} y_i)/\bigl(C(x_1 \ldots x_{l-1}) - c(x_1 \ldots x_{l-1})\bigr)$. (Of course
we generally need to know the cost of the tree before we know the exact values of these
ideal probabilities, so we cannot achieve zero variance in practice. But the form of this
solution shows what kinds of bias are likely to reduce the variance.)


55. The effects of lookahead, dynamic ordering, and reversible memory are all captured
easily by a well-designed cost function at each node. But there’s a fundamental
difference in step C2, because different codeword classes can be selected for branching
at the same node (that is, with the same ancestors $x_1 \ldots x_{l-1}$) after C5 has undone
the effects of a prior choice. The level l never surpasses L + 1, but in fact the search
tree involves hidden levels of branching that are implicitly combined into single nodes.
Thus it’s best to view Algorithm C’s search tree as a sequence of binary branches:
Should x be one of the codewords or not? (At least this is true when the “max kill”
strategy of answer 39 has selected the branching variable x. But if r > 1 and the poison
list is empty, an r-way branch is reasonable (or an (r + 1)-way branch when the slack
is positive), because r will be reduced by 1 and the same class c will be chosen after x
has been explored.)

If x has been selected because it kills many other potential codewords, we probably
should bias the branch probability as in exercise 52, giving smaller weight to the “yes”
branch because the branch that includes x is less likely to lead to a large subtree.

57. Let $p_k = 1/D^{(k)}$ be the probability that Algorithm E terminates at the kth leaf.
Then $\sum_{k=1}^{M} (1/M) \lg\bigl(1/(M p_k)\bigr)$ is the Kullback–Leibler divergence $D(q \| p)$, where q is
the uniform distribution (see exercise MPR–121). Hence $\frac{1}{M} \sum_{k=1}^{M} \lg D^{(k)} \ge \lg M$. (The
result of this exercise is essentially true in any probability distribution.)

58. Let ∞ be any convenient value ≥ n. When vertex v becomes part of the path we
will perform a two-phase algorithm. The first phase identifies all “tarnished” vertices,
whose DIST must change; these are the vertices u from which every path to t passes
through v. It also forms a queue of “resource” vertices, which are untarnished but
adjacent to tarnished ones. The second phase updates the DISTs of all tarnished vertices
that are still connected to t. Each vertex has LINK and STAMP fields in addition to DIST.

For the first phase, set d ← DIST(v), DIST(v) ← ∞ + 1, R ← Λ, T ← v, LINK(v) ←
Λ, then do the following while T ≠ Λ: (*) Set u ← T, T ← S ← Λ. For each w — u, if
DIST(w) < d do nothing (this happens only when u = v); if DIST(w) ≥ ∞ do nothing
(w is gone or already known to be tarnished); if DIST(w) = d, make w a resource (see
below); otherwise DIST(w) = d + 1. If w has no neighbor at distance d, w is tarnished:
Set LINK(w) ← T, DIST(w) ← ∞, T ← w. Otherwise make w a resource (see below).
Then set u ← LINK(u), and return to (*) if u ≠ Λ.

The queue of resources will start at R. We will stamp each resource with v so
that nothing is added twice to that queue. To make w a resource when DIST(w) = d,
do the following (unless u = v or STAMP(w) = v): Set STAMP(w) ← v; if R = Λ, set
R ← RT ← w; otherwise set LINK(RT) ← w and RT ← w. To make w a resource when
DIST(w) = d + 1 and u ≠ v and STAMP(w) ≠ v, put it first on stack S as follows: Set
STAMP(w) ← v; if S = Λ, set S ← SB ← w; otherwise set LINK(w) ← S, S ← w.

Finally, when u = Λ, we append S to R: Nothing needs to be done if S = Λ.
Otherwise, if R = Λ, set R ← S and RT ← SB; but if R ≠ Λ, set LINK(RT) ← S and
RT ← SB. (These shenanigans keep the resource queue in order by DIST.)

Phase 2 operates as follows: Nothing needs to be done if R = Λ. Otherwise we set
LINK(RT) ← Λ, S ← Λ, and do the following while R ≠ Λ or S ≠ Λ: (i) If S = Λ, set
d ← DIST(R). Otherwise set u ← S, d ← DIST(u), S ← Λ; while u ≠ Λ, update the
neighbors of u and set u ← LINK(u). (ii) While R ≠ Λ and DIST(R) = d, set u ← R,
R ← LINK(u), and update the neighbors of u. In both cases “update the neighbors
of u” means to look at all w — u, and if DIST(w) = ∞ to set DIST(w) ← d + 1,
STAMP(w) ← v, LINK(w) ← S, and S ← w. (It works!)

59. (a) Compute the generating function $g(z)$ (see exercise 7.1.4-209) and then

(b) Let $(A, B, C)$ denote paths that touch (center, NE corner, SW corner). Recursively
compute eight counts $(c_0, \ldots, c_7)$ at each node, where $c_j$ counts paths $\pi$
with $j = 4[\pi \in A] + 2[\pi \in B] + [\pi \in C]$. At the sink node $\top$ we have $c_0 = 1$,
$c_1 = \cdots = c_7 = 0$. Other nodes have the form $x = (\bar e?\ x_l\colon x_h)$ where $e$ is an edge.
Two edges go across the center and affect $A$; three edges affect each of $B$ and $C$. Say
that those edges have types 4, 2, 1, respectively; other edges have type 0. Suppose the
counts for $x_l$ and $x_h$ are $(c'_0, \ldots, c'_7)$ and $(c''_0, \ldots, c''_7)$, and $e$ has type $t$. Then count $c_j$
for node $x$ is $c'_j + [t = 0]\,c''_j + [t \mathbin{\&} j \ne 0]\,(c''_j + c''_{j-t})$.

(This procedure yields the following exact “Venn diagram” set counts at the root:

$c_0 = |\bar A \cap \bar B \cap \bar C| = 7653685384889019648091604$;
$c_1 = c_2 = |\bar A \cap \bar B \cap C| = |\bar A \cap B \cap \bar C| = 7755019053779199171839134$;
$c_3 = |\bar A \cap B \cap C| = 7857706970503366819944024$;
$c_4 = |A \cap \bar B \cap \bar C| = 4888524166534573765995071$;
$c_5 = c_6 = |A \cap \bar B \cap C| = |A \cap B \cap \bar C| = 4949318991771252110605148$;
$c_7 = |A \cap B \cap C| = 5010950157283718807987280$.)
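
The recurrence of part (b) can be sketched as a small Python function. This is an illustration only: the name `combine`, the list representation of the eight counts, and the constant `SINK` are assumptions made here, and the traversal of the actual decision diagram is not shown.

```python
def combine(lo, hi, t):
    """Counts (c_0, ..., c_7) for a node with LO counts `lo`, HI counts `hi`,
    and an edge of type t, where t is 0, 1, 2, or 4."""
    assert t in (0, 1, 2, 4)
    counts = []
    for j in range(8):
        cj = lo[j]                      # paths that omit the edge keep their class
        if t == 0:
            cj += hi[j]                 # an edge of type 0 changes nothing
        elif t & j:
            cj += hi[j] + hi[j - t]     # paths that use the edge acquire bit t
        counts.append(cj)
    return counts


SINK = [1, 0, 0, 0, 0, 0, 0, 0]         # c_0 = 1, c_1 = ... = c_7 = 0

one_edge = combine(SINK, SINK, 4)       # a single optional edge of type 4
two_edges = combine(one_edge, one_edge, 2)
```

When $t \ne 0$ and $t \mathbin{\&} j = 0$ the HI branch contributes nothing to $c_j$, because every path through that branch uses an edge of type $t$; the total of the eight counts is nevertheless always the sum of the totals of the two children.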

60. Yes, the paths are less chaotic and the estimates are better:

[Figure: estimates obtained with dynamic ordering; the horizontal axis runs from 0 to 10000 and the vertical axis from $2\times10^{25}$ to $8\times10^{25}$.]

90. Suppose there are $n$ questions, whose answers each lie in a given set $S$. A student
supplies an answer list $\alpha = a_1 \ldots a_n$, with each $a_j \in S$; a grader supplies a Boolean
vector $\beta = x_1 \ldots x_n$. There is a Boolean function $f_{js}(\alpha, \beta)$ for each $j \in \{1, \ldots, n\}$ and
each $s \in S$. A graded answer list $(\alpha, \beta)$ is valid if and only if $F(\alpha, \beta)$ is true, where

$$F(\alpha, \beta) = F(a_1 \ldots a_n, x_1 \ldots x_n) = \bigwedge_{j=1}^{n} \bigwedge_{s \in S} \bigl([a_j = s] \Longrightarrow x_j \equiv f_{js}(\alpha, \beta)\bigr).$$

The maximum score is the largest value of $x_1 + \cdots + x_n$ over all graded answer lists
$(\alpha, \beta)$ that are valid. A perfect score is achieved if and only if $F(\alpha, 1 \ldots 1)$ holds.

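Thus, in the warmup problem we have $n = 2$, $S = \{\mathrm{A}, \mathrm{B}\}$; $f_{1\mathrm{A}} = [a_2 = \mathrm{B}]$;
$f_{1\mathrm{B}} = [a_1 = \mathrm{A}]$; $f_{2\mathrm{A}} = x_1$; $f_{2\mathrm{B}} = \bar x_2 \oplus [a_1 = \mathrm{A}]$. The four possible answer lists are:

AA: $F = (x_1 \equiv [\mathrm{A} = \mathrm{B}]) \wedge (x_2 \equiv x_1)$

AB: $F = (x_1 \equiv [\mathrm{B} = \mathrm{B}]) \wedge (x_2 \equiv \bar x_2 \oplus [\mathrm{A} = \mathrm{A}])$

BA: $F = (x_1 \equiv [\mathrm{B} = \mathrm{A}]) \wedge (x_2 \equiv x_1)$

BB: $F = (x_1 \equiv [\mathrm{B} = \mathrm{A}]) \wedge (x_2 \equiv \bar x_2 \oplus [\mathrm{B} = \mathrm{A}])$

Thus AA and BA must be graded 00; AB can be graded either 10 or 11; and BB has
no valid grading. Only AB can achieve the maximum score, 2; but 2 isn’t guaranteed.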
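
The warmup case is small enough to check by brute force. The following Python sketch enumerates all four answer lists and all four grade vectors; the function names `f`, `F`, and `valid_gradings` are chosen here for illustration and are not part of the original answer.

```python
from itertools import product

S = "AB"


def f(j, s, a, x):
    """Local function f_{js} of the warmup problem (a and x are 0-indexed)."""
    if (j, s) == (1, "A"):
        return a[1] == "B"                       # [a_2 = B]
    if (j, s) == (1, "B"):
        return a[0] == "A"                       # [a_1 = A]
    if (j, s) == (2, "A"):
        return bool(x[0])                        # x_1
    if (j, s) == (2, "B"):
        return (not x[1]) != (a[0] == "A")       # complement of x_2, XOR [a_1 = A]
    raise KeyError((j, s))


def F(a, x):
    """Only the term with s = a_j matters, since [a_j = s] is false otherwise."""
    return all(bool(x[j - 1]) == f(j, a[j - 1], a, x) for j in (1, 2))


def valid_gradings():
    """Map each answer list to the list of grade vectors that make F true."""
    return {
        "".join(a): [x for x in product((0, 1), repeat=2) if F(a, x)]
        for a in product(S, repeat=2)
    }


gradings = valid_gradings()
best = max(sum(x) for xs in gradings.values() for x in xs)
```

The dictionary `gradings` lists, for each of AA, AB, BA, BB, exactly the grade vectors described above, and `best` is the maximum score.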
In Table 666 we have, for example, $f_{1\mathrm{C}} = [a_2 \ne \mathrm{A}] \wedge [a_3 = \mathrm{A}]$; $f_{4\mathrm{D}} = [a_1 = \mathrm{D}] \wedge
[a_{15} = \mathrm{D}]$; $f_{12\mathrm{A}} = [\Sigma_{\mathrm{A}} - 1 = \Sigma_{\mathrm{B}}]$, where $\Sigma_s = \sum_{1 \le j \le 20} [a_j = s]$. It’s amusing to note
that $f_{14\mathrm{E}} = [\{\Sigma_{\mathrm{A}}, \ldots, \Sigma_{\mathrm{E}}\} = \{2, 3, 4, 5, 6\}]$.

The other cases are similar (although often more complicated) Boolean functions,
except for 20D and 20E, which are discussed further in exercise 91.

Notice that an answer list that contains both 10E and 17E must be discarded: It
can’t be graded, because 10E says $x_{10} \equiv \bar x_{17}$ while 17E says $x_{17} \equiv x_{10}$.

By suitable backtrack programming, we can prove first that no perfect score is
possible. Indeed, if we consider the answers in the order (3, 15, 20, 19, 2, 1, 17, 10, 5,
4, 16, 11, 13, 14, 7, 18, 6, 8, 12, 9), many cases can quickly be ruled out. For example,
suppose $a_3 = \mathrm{C}$. Then we must have $a_{15} \ne a_{16} \ne a_{17} = a_{18} \ne a_{19} \ne a_{20}$,
and early cutoffs are often possible. (We might reach a node where the remaining
choices for answers 5, 6, 7, 8, 9 are respectively {C, D}, {A, C}, {B, D}, {A, B, E},
{B, C, D}, say. Then if answer 8 is forced to be B, answer 7 can only be D; hence
answer 6 is also forced to be A. Also answer 9 can no longer be B.) An instructive little
propagation algorithm will make such deductions nicely at every node of the search
tree. On the other hand, difficult questions like 7, 8, 9, are best not handled with
complicated mechanisms; it’s better just to wait until all twenty answers have been
tentatively selected, and to check such hard cases only when the checking is easy and
fast. In this way the author’s program showed the impossibility of a perfect score by
exploring just 52859 nodes, after only 3.4 megamems of computation.

The next task was to try for score 19 by asserting that only $x_j$ is false. This turned
out to be impossible for $1 \le j \le 18$, based on very little computation whatsoever
(especially, of course, when $j = 6$). The hardest case, $j = 15$, needed just 56 nodes and
fewer than 5 kilomems. But then, ta da, three solutions were found: One for $j = 19$ (185
kilonodes, 11 megamems) and two for $j = 20$ (131 kilonodes, 8 megamems), namely

(The incorrect answers are shown here as lowercase letters. The first two solutions
establish the truth of 20B and the falsity of 20E.)

91. Now there’s only one list of answers with score $\ge 19$, namely (iii). But that is
paradoxical, because it claims 20E is false; hence the maximum score cannot be 19!

Paradoxical situations are indeed possible when the global function $F$ of answer 90
is used recursively within one or more of the local functions $f_{js}$. Let’s explore a bit of
recursive territory by considering the following two-question, two-letter example:

1. (A) Answer 1 is incorrect. (B) Answer 2 is incorrect.

2. (A) Some answers can’t be graded consistently. (B) No answers achieve a perfect score.

Here we have $f_{1\mathrm{A}} = \bar x_1$; $f_{1\mathrm{B}} = \bar x_2$; $f_{2\mathrm{A}} = \exists a_1 \exists a_2 \forall x_1 \forall x_2\, \neg F(a_1 a_2, x_1 x_2)$; $f_{2\mathrm{B}} =
\forall a_1 \forall a_2\, \neg F(a_1 a_2, 11)$. (Formulas quantified by $\exists a$ or $\forall a$ expand into $|S|$ terms, while $\exists x$
or $\forall x$ expand into two; for example, $\exists a \forall x\, g(a, x) = (g(\mathrm{A}, 0) \wedge g(\mathrm{A}, 1)) \vee (g(\mathrm{B}, 0) \wedge g(\mathrm{B}, 1))$
when $S = \{\mathrm{A}, \mathrm{B}\}$.) Sometimes the expansion is undefined, because it has more than
one “fixed point”; but in this case there’s no problem because $f_{2\mathrm{A}}$ is true: Answer AA
can’t be graded, since 1A implies $x_1 \equiv \bar x_1$. Also $f_{2\mathrm{B}}$ is true, because both BA and BB
imply $x_1 \equiv \bar x_2$. Thus we get the maximum score 1 with either BA or BB and grades 01.

On the other hand the simple one-question, one-letter questionnaire ‘1. (A) The
maximum score is 1’ has an indeterminate maximum score. For in this case $f_{1\mathrm{A}} =
F(\mathrm{A}, 1)$. We find that if $F(\mathrm{A}, 1) = 0$, only $(\mathrm{A}, 0)$ is a valid grading, so the only possible
score is 0; similarly, if $F(\mathrm{A}, 1) = 1$, the only possible score is 1.

OK, suppose that the maximum score for the modified Table 666 is $m$. We know
that $m < 19$; hence (iii) isn’t a valid grading. It follows that 20E is true, which means
that every valid graded list of score $m$ has $x_{20}$ false. And we can conclude that $m = 18$,
because of the following two solutions (which are the only possibilities with 20C false):

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
 B  A  d  A  B  E  D  C  D  A  E  D  A  E  D  E  D  B  E  c
 A  E  D  C  A  B  C  D  C  A  C  E  D  B  a  C  D  A  A  c

But wait: If $m = 18$, we can score 18 with 20A true and two errors, using (say)

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
 D  e  D  A  B  E  D  e  C  A  E  D  A  E  D  B  D  C  C  A

or 47 other answer lists. This contradicts $m = 18$, because $x_{20}$ is true.

End of story? No. This argument has implicitly been predicated on the assumption
that 20D is false. What if $m$ is indeterminate? Then a new solution arises

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
 D  C  E  A  B  E  D  C  E  A  E  B  A  E  D  B  D  A  d  D

of score 19. With (iii) it yields $m = 19$! If $m$ is determinate, we’ve shown that $m$
cannot actually be defined consistently; but if $m$ is indeterminate, it’s definitely 19.
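
The two small questionnaires discussed earlier in this answer (the two-question, two-letter example and the one-question example) can be explored by brute force. The Python sketch below rests on one interpretive assumption that should be stated loudly: a “fixed point” is taken to be an assignment of truth values to the self-referential local functions that reproduces itself when those functions are re-evaluated. All function names are hypothetical.

```python
from itertools import product

S = "AB"
BITS = (0, 1)
LISTS = list(product(S, repeat=2))
GRADES = list(product(BITS, repeat=2))


def make_F(p, q):
    """F for the two-question example, ASSUMING f_2A has value p and f_2B has value q."""
    def F(a, x):
        f1 = 1 - x[0] if a[0] == "A" else 1 - x[1]   # 1A: not x_1;  1B: not x_2
        f2 = p if a[1] == "A" else q
        return x[0] == f1 and x[1] == f2
    return F


def two_question_fixed_points():
    """Assumed values (p, q) that agree with the values recomputed from F."""
    found = []
    for p, q in product(BITS, repeat=2):
        F = make_F(p, q)
        f2A = int(any(not any(F(a, x) for x in GRADES) for a in LISTS))
        f2B = int(all(not F(a, (1, 1)) for a in LISTS))
        if (f2A, f2B) == (p, q):
            found.append((p, q))
    return found


def max_score(p, q):
    """Largest x_1 + x_2 over valid graded lists, or None if there are none."""
    F = make_F(p, q)
    scores = [sum(x) for a in LISTS for x in GRADES if F(a, x)]
    return max(scores) if scores else None


def one_question_fixed_points():
    """'1. (A) The maximum score is 1': f_1A = F(A, 1), with F(A, x) = (x == f_1A)."""
    return [p for p in BITS if int(1 == p) == p]
```

Under this interpretation the two-question example has a single consistent assignment, so its maximum score is well defined, while the one-question example has two, which is one way to see why its maximum score is indeterminate.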


Question 20 was designed to create difficulties. [: -)] 

—— DONALD R. WOODS (2001) 


95. (a) Let $X_k$ be the number of nodes at distance $k - 1$ from the root.

(b) Let Q^ m ) = + ••• + Pn m \ Then we have the joint recurrence Pf 


) 


1 


P 


(m) 


n+1 


= Qn m ^\ in particular, Q[ m ^ = m. And for n > 2, we have a nk(:) 

for certain constants a n k that can computed as follows: Set tk P》) for 1 < k < n. 
Then for A: = 2, • • •, n set t n ^ t n — t n -i, ••” tk t tk — tk—i. Finally a n k ^r- tk for 
1 < A: < n- For example, a ‘21 = a‘ 2‘2 = 2; a 3 i = 6, 032 = 14, 033 = 8 - The numbers 
have 0(n 2 + n log m) bits，so this method needs O(n 0 ) bit operations to compute P n . 


(c) $P_n^{(m)}$ corresponds to random paths with $X_1 = m$, $D_k = 2X_k$, $X_{k+1} = \lceil 2U_kX_k \rceil$,
where each $U_k$ is an independent uniform deviate. Therefore $P_n^{(m)} = \mathrm{E}(D_1 \ldots D_{n-1})$
is the number of nodes on level $n$ of an infinite tree. We have $X_{k+1} \ge 2^k U_1 \ldots U_k\, m$,
by induction; hence $P_n^{(m)} \ge \mathrm{E}\bigl(2^{\binom{n}{2}} U_1^{n-2} U_2^{n-3} \ldots U_{n-2}^{1}\, m^{n-1}\bigr) = 2^{\binom{n}{2}} m^{n-1}/(n-1)!$.


[M. Cook and M. Kleber have discussed similar sequences in Electronic Journal
of Combinatorics 7 (2000), #R44. See also K. Mahler’s asymptotic formula for binary
partitions, in J. London Math. Society 15 (1940), 115-123, which shows that
$\lg P_n = \binom{n}{2} - \lg (n-1)! + \binom{\lg n}{2} + O(1)$.]


999 - 
He writes indexes to perfection.
— OLIVER GOLDSMITH, Citizen of the World (1762) 

When an index entry refers to a page containing a relevant exercise, see also the answer to 
that exercise for further information. An answer page is not indexed here unless it refers to a 
topic not included in the statement of the exercise. 


2-letter block codes, 27.

4-letter codewords, 9-18, 27-29.

5-letter words of English, 8-9, 27.

6-letter and k-letter words of English, 8-9.

γ (Euler’s constant), as source of “random” data, 19.

π (circle ratio), as source of “random” data, 19-20, 22, 28.

φ (golden ratio), as source of “random” data, 19.

A posteriori probability, 39.
Active elements of a list, 12.

Ahrens, Joachim Heinrich Lüdecke, 6, 26, 33.

Analysis of algorithms, 28, 29. 

Aperiodic words, 10, 27, 36. 

Backjumping, 24. 

Backmarking, 24. 

Backtrack programming, 2-∞.
efficiency of, 32.
history of, 2, 5-6, 24-25.
introduction to, 2-31.
Backtrack trees, 3, 4, 7, 9-11, 18-20, 38, 39, 41.
estimating the size of, 20, 38.

Baumert, Leonard Daniel, 24, 45.
Bees, queen, 26.

Bernoulli, Jacques (= Jakob = James), 24.
Bezzel, Max Friedrich Wilhelm, 24.

Biased random walks, 29, 40. 

Binary partitions, 31. 

Binomial trees, 21. 

Bitner, James Richard, 6, 24. 

Bitwise operations, 5, 26, 34. 

Block codes, 9, 27- 
Boundary markers, 28- 
Breadth-first search, 33. 

Breaking symmetry, 8, 14. 

British National Corpus, 8. 

Broken diagonals, see Wraparound- 
Bumping the current stamp, 16, 28. 

Bunch, Steve Raymond, 6. 


Cayley, Arthur, 31. 

Cells of memory, 11. 

Chatterjee, Sourav, 22.

Chessboard, 2-6, 22-24, 26, 30. 

Chiral symmetry, 26- 
Closed lists, 14. 

CNF: Conjunctive normal form, 38. 
Codewords, commafree, 9—18, 27-29. 

Coin flipping, 29. 

Combinations, 26- 

Commafree codes, 9-18, 24, 27-29. 

Compilers, 15- 

Complexity of calculation, 38. 

Compressed tries, 35. 

Computational complexity, 38. 
Concatenation, 9. 

Cook, Matthew Makonnen, 43. 
Corner-to-corner paths, 22-23, 29-30- 
Cost function, 19, 39. 

Crick, Francis Harry Compton, 9. 

Cumulative binomial distribution, 39. 

Cutoff principle, 7. 

Cutoff properties, 2, 5，10，18, 26. 

Cyclic shifts, 10, 27- 

Dancing links, 7. 

Data structures, 4-6, 9, 11-14, 18, 28.
de Jaenisch, Carl Friedrich Andreevitch (Янишъ, Карлъ Андреевичъ), 32.
Degree of a node, 19.

Deletion operation, 7, 12—13, 37. 

Depth-first search, 24. 

Diaconis, Persi Warren, 22. 

Diagonal lines (slope ±1), 3, 33, see also Wraparound.

Digraphs, 27.

DIMACS: DIMACS Series in Discrete Mathematics and Theoretical Computer
Science, inaugurated in 1990.

Dips, 28. 

Discarded data, 22. 

Discrete probabilities, 1- 
Distributed computations, 6. 

Divergence, Kullback—Leibler, 40. 

Dodgson, Charles Lutwidge, iii- 
Domains, 2, 26, 27. 

Downdating versus updating, 4—5, 11, 15. 
Dual solutions, 6, 8, 27. 

Dynamic ordering, 10—11, 24, 29, 41. 


Cache-friendly data structures, 11. 
Carroll, Lewis (= Dodgson, Charles 
Lutwidge), iii- 

Cavenagh, Nicholas John, 33. 
Dynamic shortest distances, 29.

e, as source of “random” data, 19.
Eastman, Willard Lawrence, 27, 28, 36.
Eight queens problem, 3-4, 19-20, 24.
Empty lists, 12, 17.
English words, 8-9, 27.
Error bars, 22.

Estimates of run time, 18-21. 

Estimating the number of solutions, 21-23. 

Fallback points, 16. 

Five-letter words, 8-9, 27. 

Fixed point of recursive formula, 42. 

Floyd, Robert W, 15. 

Four-letter codewords, 9-18, 27-29. 

FPGA devices: Field-programmable gate arrays, 6.

Frames, 11.

Franel, Jérôme, 33.

Gaschnig, John Gary, 24.

Gattegno, Bernard, 35.
Gauß (= Gauss), Johann Friderich Carl (= Carl Friedrich), 24, 32.

Generating functions, 28, 39, 40.

Gigamem (Gμ): One billion memory accesses, 5.
Global variables, 26.

Goldsmith, Oliver, 44. 

Golomb, Solomon Wolf, 9, 24, 45. 

Gordon, Basil, 9, 45. 

Graders, 41.

Griffith, John Stanley, 9.

Hales, Alfred Washington, 45.

Hall, Marshall, Jr., 24.

Hamilton, William Rowan, paths, 23.
Hammersley, John Michael, 24.
Handscomb, David Christopher, 24.

Height of binary trees, 31.
Hexagons, 26.

Historical notes, 2, 5-6, 24-25.
Honeycombs, 26.
Hurwitz, Adolf, 33.

IBM 1620 computer, 6.

IBM System 360-75 computer, 6.
Importance sampling, 24.

Indeterminate statements, 42.

Inner loops, 32. 

Insertion operation, 12. 

Integer partitions, 26, 31. 

Internet, ii, iii. 

Inverse lists, 12—15, 17. 

Inverse permutations, 12-13. 

Iteration versus recursion, 26, 32. 


Jaenisch, Carl Friedrich Andreevitch de (Янишъ, Карлъ Андреевичъ), 32.
Jewett, Robert Israel, 45.

Jiggs, B. H. (pen name of Baumert, Hales, Jewett, Imaginary, Golomb, Gordon,
and Selfridge), 17.

Kennedy, Michael David, 6.

Kilomem (Kμ): One thousand memory accesses, 41.

King, Benjamin Franklin, Jr., 2.

King paths, 22-23, 25, 29-30.

Kleber, Michael Steven, 43.

Knight moves, 23.

Knuth, Donald Ervin (高德纳), i, iv, 18, 24, 26, 35, 38, 42.

Kullback, Solomon, 40. 

Langford, Charles Dudley, pairs, 6-8, 26-27.
Laxdal, Albert Lee, 17.
Leibler, Richard Arthur, 40.
Lennon, John Winston Ono, 2.
Lexicographic order, 2, 7, 24, 28.

Linked lists, 6-7, 37.
Lookahead, 10, 16, 24, 29.

Loose Langford pairs, 27.

Lucas, François Édouard Anatole, 24, 33.

Mahler, Kurt, 43.

Martingales, 29.
Masks, 32.

Megamem (Mμ): One million memory accesses, 18.

MEM, an array of “cells,” 11-18, 28.

Memory constraints, historic, 36.
Mem (μ): One 64-bit memory access, 4.
Minimum remaining values heuristic, 24.
MMIX computer, ii.

Monte Carlo estimates, 18-25, 29. 

Moves, 11.

MPR: Mathematical Preliminaries Redux, v, 1.

n-letter words of English, 8.
n queens problem, 2-6, 18-20, 24, 26.
n-tuples, 26.

Nauck, Franz, 24.

Nested parentheses, 26.
Niho, Yoji Goff (仁保洋二), 37.
Nonisomorphic solutions, 29. 

Onnen, Hendrick, Sr., 6. 

Optimization, 24. 

Orgel, Leslie Eleazer, 9. 

Orthogonal lists, 27. 

OSPD4: Official SCRABBLE® Players Dictionary, 8.

Overflow of memory, 12, 16.

$P_0()$, 2.

Paradox, 31.
Parentheses, 26.

Parity argument, 34.

Partitions, 26, 31.

Paths, simple, 22, 25, 29.

Pauls, Emil, 33.

Pencil-and-paper method, 18-20.

Periodic sequences, 27.
Periodic words, 10, 13.
Permutations, 6, 26.

Phi (φ), as source of “random” data, 19.

Pi (π), as source of “random” data, 19-20, 22, 28.

Poison list, 16-17, 28-29, 39.
Pólya, György (= George), 33.
Preußer, Thomas Bernd, 6.

Prime strings, 10.
Probabilities, 1.

Profile of a tree, 3, 9, 27, 31, 38. 

Propagation algorithm, 42. 

Properties: Logical propositions 
(relations) ， 2, 26- 

q.s., 29. 

Quantified Boolean formulas ， 42. 

Queen bees, 26. 

Questionnaires, 30. 

Queues, 36, 40. 

Quick, Jonathan Horatio, 26. 

Radix m representation, 13- 
Random bits, 29- 
Random sampling, 18. 

Random variables, 19- 
Random walks, 18-23, 29. 

Recurrence relations, 39, 43. 

in a Boolean equation, 42. 

Recursion versus iteration, 26, 32.

Recursive algorithms, 26, 39.

Reflection symmetry, 14, 26, 32, see also Dual solutions.
Registers, 5, 34-35.

Reingold, Edward Martin, 24.

Rejection method, 19, 38. 

Restricted growth strings, 32. 

Reversible memory technique, 16, 29.
Rivin, Igor (Ривин, Игорь Евгеньевич), 33.
Root node, 19.

Rosenbluth, Arianna Wright, 25. 

Rosenbluth, Marshall Nicholas, 25. 

Rotation by 90°, 26- 
Running time estimates, 18-21. 

Sample variance, 22. 

SAT solvers, 38. 

Saturating ternary addition, 35. 

Schumacher, Heinrich Christian, 24, 32. 


Search rearrangement, see Dynamic 
ordering. 

Search trees, 3, 4, 7, 9—11 ， 18—20, 33, 
38, 39, 42. 

estimating the size, 20, 38- 
Self-avoiding walks ， 22, 25, 29. 
Self-reference, 30, 46- 
Self-synchronizing block codes, 9. 
Selfridge, John Lewis, 45. 
Semi-queens, 33- 
Sequential allocation, 28. 

Sequential lists, 11-15. 

Set partitions, 26. 

Shortest distances, dynamic, 29. 
Signature of a trie node, 34. 

Simple paths ， 22, 25, 29. 

Slack, 18, 39. 

Sprague, Thomas Bond, 5, 6, 26. 
Stacks, 16, 36. 

Stamping, iv, 16—19, 28. 

Standard deviation, 20, 22, 29. 
Stanford GraphBase, ii, 8. 

Statistics, 22. 

Stirling, James, cycle numbers, 29. 
Students, 41. 

Substrings, 28. 

Subtrees, 20- 
Superdips, 36. 

SWAC computer, 6. 

Symmetries, 29, 32. 
breaking, 8, 14. 

Tail inequality, 39.

Teramem (Tμ): One trillion memory accesses, 9.

Torus, 26.

Tot tibi …, 24.

Traub, Joseph Frederick, 38.
Trémaux, Charles Pierre, 24.

Tries, 8-9, 27, 34.
compressed, 35.
Tuples, 26.

Twenty Questions, 30.
Two-letter block codes, 27. 

UCLA: The University of California 
at Los Angeles, 6. 

UNDO stack, 16. 

Undoing ， 4-5, 7, 15-16, 28. 

Uniformly random numbers, 38. 

Unit clauses, 38. 

University of California, 6. 

University of Dresden, 6. 

University of Illinois, 6. 

University of Tennessee, 6. 

Unordered sequential lists, 11. 
Unordered sets, 36. 


December 21, 2015 



INDEX AND GLOSSARY 


47 


Valid gradings, 41.

Vardi, Ilan, 33.

Variance of a random variable, 22, 29.
Venn, John, diagram, 40.

Visiting an object, 2, 4, 5, 7, 18. 

Walker, Robert John, 2, 5, 6, 24. 
Wanless, Ian Murray, 33. 

Welch, Lloyd Richard, 9. 

Wells, Mark Brimhall, 24. 

White squares, 26. 

Woods, Donald Roy, 30, 43. 


Word rectangles, 8-9, 27. 

WORDS (n), the n most common five-letter 
words of English, 8. 

Worst-case bounds, 28. 

Wraparound, 26. 

Yao, Andrew Chi-Chih (姚期智), 38.

ZDD: A zero-suppressed decision 
diagram, 23, 30. 

Zimmermann, Paul Vincent Marie, 33. 